HETEROLOGOUS EXPRESSION OF NEISSERIAL PROTEINS

All documents cited herein are incorporated by reference in their entirety. 

TECHNICAL FIELD 

This invention is in the field of protein expression. In particular, it relates to the heterologous
expression of proteins from Neisseria (e.g. N. gonorrhoeae or, preferably, N. meningitidis).

BACKGROUND ART 

International patent applications WO99/24578, WO99/36544, WO99/57280 and
WO00/22430 disclose proteins from Neisseria meningitidis and Neisseria gonorrhoeae.
These proteins are typically described as being expressed in E.coli (i.e. heterologous
expression) as either N-terminal GST-fusions or C-terminal His-tag fusions, although other
expression systems, including expression in native Neisseria, are also disclosed.

It is an object of the present invention to provide alternative and improved approaches for 
the heterologous expression of these proteins. These approaches will typically affect the 
level of expression, the ease of purification, the cellular localisation of expression, and/or the 
immunological properties of the expressed protein.

DISCLOSURE OF THE INVENTION 

Nomenclature herein 

The 2166 protein sequences disclosed in WO99/24578, WO99/36544 and WO99/57280 are
referred to herein by the following SEQ# numbers:

Application   Protein sequences        SEQ# herein
WO99/24578    Even SEQ IDs 2-892       SEQ#s 1-446
WO99/36544    Even SEQ IDs 2-90        SEQ#s 447-491
WO99/57280    Even SEQ IDs 2-3020      SEQ#s 492-2001
              Even SEQ IDs 3040-3114   SEQ#s 2002-2039
              SEQ IDs 3115-3241        SEQ#s 2040-2166

In addition to this SEQ# numbering, the naming conventions used in WO99/24578,
WO99/36544 and WO99/57280 are also used (e.g. 'ORF4', 'ORF40', 'ORF40-1' etc. as
used in WO99/24578 and WO99/36544; 'm919', 'g919' and 'a919' etc. as used in
WO99/57280).

The 2160 proteins NMB0001 to NMB2160 from Tettelin et al [Science (2000) 287:1809-
1815] are referred to herein as SEQ#s 2167-4326 [see also WO00/66791].
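
The numbering scheme above can be expressed as a small lookup. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that sequences keep their original order within each range (so that, for example, SEQ ID 2 of WO99/24578 is SEQ# 1 and NMB0001 is SEQ# 2167), which the table implies but does not state.

```python
def seq_number(application: str, seq_id: int) -> int:
    """Map a SEQ ID from a cited application to the SEQ# used herein.

    Assumes sequences keep their original order within each range.
    """
    even = seq_id % 2 == 0
    if application == "WO99/24578" and even and 2 <= seq_id <= 892:
        return seq_id // 2                      # SEQ#s 1-446
    if application == "WO99/36544" and even and 2 <= seq_id <= 90:
        return 446 + seq_id // 2                # SEQ#s 447-491
    if application == "WO99/57280":
        if even and 2 <= seq_id <= 3020:
            return 491 + seq_id // 2            # SEQ#s 492-2001
        if even and 3040 <= seq_id <= 3114:
            return 2002 + (seq_id - 3040) // 2  # SEQ#s 2002-2039
        if 3115 <= seq_id <= 3241:
            return 2040 + (seq_id - 3115)       # SEQ#s 2040-2166
    raise ValueError(f"SEQ ID {seq_id} of {application} is not in the table")


def nmb_seq_number(nmb: int) -> int:
    """Map NMB0001..NMB2160 to SEQ#s 2167-4326 (same ordering assumption)."""
    if not 1 <= nmb <= 2160:
        raise ValueError("NMB number must be between 1 and 2160")
    return 2166 + nmb


if __name__ == "__main__":
    print(seq_number("WO99/57280", 3115))
    print(nmb_seq_number(2160))
```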

The term 'protein of the invention' as used herein refers to a protein comprising: 
(a) one of sequences SEQ#s 1-4326; or 
(b) a sequence having sequence identity to one of SEQ#s 1-4326; or
(c) a fragment of one of SEQ#s 1-4326.

The degree of 'sequence identity' referred to in (b) is preferably greater than 50% (e.g. 60%,
70%, 80%, 90%, 95%, 99% or more). This includes mutants and allelic variants [e.g. see
WO00/66741]. Identity is preferably determined by the Smith-Waterman homology search
algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap
search with parameters gap open penalty=12 and gap extension penalty=1. Typically, 50%
identity or more between two proteins is considered to be an indication of functional
equivalence.
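
As a rough illustration of the identity threshold, the Python sketch below scores two sequences that have already been aligned. It does not perform the Smith-Waterman search itself and is not the MPSRCH implementation; the function names are hypothetical, and the choice of denominator (all alignment columns, including gapped ones) is an assumption, since the text does not specify one.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Gaps are written as '-'. The denominator is the number of alignment
    columns (an assumed convention; others exist).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def exceeds_identity_threshold(aligned_a: str, aligned_b: str,
                               threshold: float = 50.0) -> bool:
    """True if identity is greater than the threshold (50% by default)."""
    return percent_identity(aligned_a, aligned_b) > threshold


if __name__ == "__main__":
    # Toy alignment, not taken from any real sequence.
    print(percent_identity("ACDE", "ACDF"))
```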

The 'fragment' referred to in (c) should comprise at least n consecutive amino acids from
one of SEQ#s 1-4326 and, depending on the particular sequence, n is 7 or more (e.g. 8, 10,
12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100 or more). Preferably the fragment
comprises an epitope from one of SEQ#s 1-4326. Preferred fragments are those disclosed in
WO00/71574 and WO01/04316.
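
The fragment criterion (at least n consecutive amino acids, with n being 7 or more) can be checked with a longest-common-substring search. The following is a minimal Python sketch with hypothetical function names; the reference used in the demonstration is simply the first 20 residues of protein 919 as given in Example 1.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest run of consecutive residues common to both."""
    best = 0
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                # Extend the run that ended at the previous residue pair.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def comprises_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """True if candidate contains at least n consecutive reference residues."""
    if n < 7:
        raise ValueError("n is 7 or more")
    return longest_shared_run(candidate, reference) >= n


if __name__ == "__main__":
    reference = "MKKYLFRAALYGIAAAILAA"   # first 20 residues of 919
    candidate = "GGKYLFRAALGG"           # hypothetical test protein
    print(comprises_fragment(candidate, reference, n=7))
```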

Preferred proteins of the invention are found in N. meningitidis serogroup B. 

Preferred proteins for use according to the invention are those of serogroup B N. meningitidis
strain 2996 or strain 394/98 (a New Zealand strain). Unless otherwise stated, proteins
mentioned herein are from N. meningitidis strain 2996. It will be appreciated, however, that
the invention is not in general limited by strain. References to a particular protein (e.g. '287',
'919' etc.) may be taken to include that protein from any strain.

Non-fusion expression

In a first approach to heterologous expression, no fusion partner is used, and the native 
leader peptide (if present) is used. This will typically prevent any 'interference' from fusion
partners and may alter cellular localisation and/or post-translational modification and/or
folding in the heterologous host.

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) no fusion partner is used, and (b) the protein's native leader peptide 
(if present) is used. 

The method will typically involve the step of preparing a vector for expressing a protein of
the invention, such that the first expressed amino acid is the first amino acid (methionine) of
said protein, and the last expressed amino acid is the last amino acid of said protein (i.e. the
codon preceding the native STOP codon).

This approach is preferably used for the expression of the following proteins using the native 
leader peptide: 111, 149, 206, 225-1, 235, 247-1, 274, 283, 286, 292, 401, 406, 502-1, 503, 
519-1, 525-1, 552, 556, 557, 570, 576-1, 580, 583, 664, 759, 907, 913, 920-1, 936-1, 953,
961, 983, 989, Orf4, Orf7-1, Orf9-1, Orf23, Orf25, Orf37, Orf38, Orf40, Orf40.1, Orf40.2,
Orf72-1, Orf76-1, Orf85-2, Orf91, Orf97-1, Orf119, Orf143.1, NMB0109 and NMB2050.
The suffix 'L' used herein in the name of a protein indicates expression in this manner using 
the native leader peptide. 

Proteins which are preferably expressed using this approach using no fusion partner and
which have no native leader peptide include: 008, 105, 117-1, 121-1, 122-1, 128-1, 148,
216, 243, 308, 593, 652, 726, 926, 982, Orf83-1 and Orf143-1.

Advantageously, it is used for the expression of ORF25 or ORF40, resulting in a protein 
which induces better anti-bactericidal antibodies than GST- or His-fusions. 

This approach is particularly suited for expressing lipoproteins.

Leader-peptide substitution

In a second approach to heterologous expression, the native leader peptide of a protein of the 
invention is replaced by that of a different protein. In addition, it is preferred that no fusion 
partner is used. Whilst using a protein's own leader peptide in heterologous hosts can often 
localise the protein to its 'natural' cellular location, in some cases the leader sequence is not
efficiently recognised by the heterologous host. In such cases, a leader peptide known to 
drive protein targeting efficiently can be used instead. 

Thus the invention provides a method for the heterologous expression of a protein of the 
invention, in which (a) the protein's leader peptide is replaced by the leader peptide from a 
different protein and, optionally, (b) no fusion partner is used.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of
the invention; manipulating said nucleic acid to remove nucleotides that encode the protein's
leader peptide and to introduce nucleotides that encode a different protein's leader peptide.
The resulting nucleic acid may be inserted into an expression vector, or may already be part
of an expression vector. The expressed protein will consist of the replacement leader peptide
at the N-terminus, followed by the protein of the invention minus its leader peptide.
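
At the protein level, the outcome of leader-peptide substitution can be sketched as follows. This Python fragment is illustrative only: the function name is hypothetical, it operates on amino acid strings rather than on nucleic acid, and the 20-residue length taken for the native 919 leader is an inference from the sequences given in Example 1. The ORF4 leader sequence is the one quoted in Example 1.

```python
ORF4_LEADER = "MKTFFKTLSAAALALILAA"  # ORF4 leader peptide, as quoted in Example 1


def replace_leader(protein: str, native_leader_len: int, new_leader: str) -> str:
    """Return new_leader followed by the protein minus its native leader."""
    if not 0 <= native_leader_len <= len(protein):
        raise ValueError("leader length must lie within the protein")
    mature = protein[native_leader_len:]
    return new_leader + mature


if __name__ == "__main__":
    # First 30 residues of 919; the native leader is assumed to be residues 1-20.
    n_terminus_919 = "MKKYLFRAALYGIAAAILAACQSKSIQTFP"
    print(replace_leader(n_terminus_919, 20, ORF4_LEADER))
```

Passing an empty string as the replacement leader gives the leader-deleted form instead.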

The leader peptide is preferably from another protein of the invention (e.g. one of SEQ#s
1-4326), but may also be from an E.coli protein (e.g. the OmpA leader peptide) or an
Erwinia carotovora protein (e.g. the PelB leader peptide), for instance.

A particularly useful replacement leader peptide is that of ORF4. This leader is able to direct
lipidation in E.coli, improving cellular localisation, and is particularly useful for the
expression of proteins 287, 919 and ΔG287. The leader peptide and N-terminal domains of
961 are also particularly useful.

Another useful replacement leader peptide is that of E.coli OmpA. This leader is able to
direct membrane localisation in E.coli. It is particularly advantageous for the expression of
ORF1, resulting in a protein which induces better anti-bactericidal antibodies than both
fusions and protein expressed from its own leader peptide.

Another useful replacement leader peptide is MKKYLFSAA. This can direct secretion into 
culture medium, and is extremely short and active. The use of this leader peptide is not 
restricted to the expression of Neisserial proteins - it may be used to direct the expression of
any protein (particularly bacterial proteins). 

Leader-peptide deletion 

In a third approach to heterologous expression, the native leader peptide of a protein of the 
invention is deleted. In addition, it is preferred that no fusion partner is used. 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) the protein's leader peptide is deleted and, optionally, (b) no fusion 
partner is used. 

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of 
the invention; manipulating said nucleic acid to remove nucleotides that encode the protein's 
leader peptide. The resulting nucleic acid may be inserted into an expression vector, or may
already be part of an expression vector. The first amino acid of the expressed protein will be
that of the mature native protein. 

This method can increase the levels of expression. For protein 919, for example, expression 
levels in E.coli are much higher when the leader peptide is deleted. Increased expression 
may be due to altered localisation in the absence of the leader peptide.

The method is preferably used for the expression of 919, ORF46, 961, 050-1, 760 and 287.

Domain-based expression

In a fourth approach to heterologous expression, the protein is expressed as domains. This 
may be used in association with fusion systems (e.g. GST or His-tag fusions).

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) at least one domain in the protein is deleted and, optionally, (b) no 
fusion partner is used. 

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of 
the invention; manipulating said nucleic acid to remove at least one domain from within the 
protein. The resulting nucleic acid may be inserted into an expression vector, or may already
be part of an expression vector. Where no fusion partners are used, the first amino acid of the 
expressed protein will be that of a domain of the protein. 

A protein is typically divided into notional domains by aligning it with known sequences in 
databases and then determining regions of the protein which show different alignment 
patterns from each other.

The method is preferably used for the expression of protein 287. This protein can be 
notionally split into three domains, referred to as A, B & C (see Figure 5). Domain B aligns
strongly with IgA proteases, domain C aligns strongly with transferrin-binding proteins, and
domain A shows no strong alignment with database sequences. An alignment of
polymorphic forms of 287 is disclosed in WO00/66741.

Once a protein has been divided into domains, these can be (a) expressed singly, (b) deleted
from within the protein e.g. protein ABCD → ABD, ACD, BCD etc., or (c) rearranged e.g.
protein ABC → ACB, CAB etc. These three strategies can be combined with fusion partners
if desired.

ORF46 has also been notionally split into two domains - a first domain (amino acids 1-433)
which is well-conserved between species and serogroups, and a second domain (amino acids 
433-608) which is not well-conserved. The second domain is preferably deleted. An 
alignment of polymorphic forms of ORF46 is disclosed in WO00/66741. 

Protein 564 has also been split into domains (Figure 8), as have protein 961 (Figure 12) and
protein 502 (amino acids 28-167 of the MC58 protein). 
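
The three domain strategies described above (single expression, deletion and rearrangement) amount to slicing a sequence at domain boundaries and re-joining the pieces in a chosen order. The Python sketch below illustrates this with hypothetical function names, a toy sequence and toy boundaries; none of the coordinates correspond to the real domains of 287, 564, 961 or ORF46.

```python
def split_domains(protein: str, boundaries: dict) -> dict:
    """Cut a protein into named domains.

    boundaries maps a domain name to (start, end), 1-based and inclusive.
    """
    return {name: protein[start - 1:end]
            for name, (start, end) in boundaries.items()}


def assemble(domains: dict, order: str) -> str:
    """Join the named domains in the given order, e.g. 'ABD' or 'ACB'."""
    return "".join(domains[name] for name in order)


if __name__ == "__main__":
    # Toy protein with four notional domains A, B, C and D.
    toy = "AAAABBBCCDDDDD"
    toy_boundaries = {"A": (1, 4), "B": (5, 7), "C": (8, 9), "D": (10, 14)}
    domains = split_domains(toy, toy_boundaries)
    print(assemble(domains, "B"))     # a single domain
    print(assemble(domains, "ABD"))   # domain C deleted
    print(assemble(domains, "ACB"))   # domains rearranged
```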

Hybrid proteins 

In a fifth approach to heterologous expression, two or more (e.g. 3, 4, 5, 6 or more) proteins
of the invention are expressed as a single hybrid protein. It is preferred that no
non-Neisserial fusion partner (e.g. GST or poly-His) is used.

This offers two advantages. Firstly, a protein that may be unstable or poorly expressed on its 
own can be assisted by adding a suitable hybrid partner that overcomes the problem. 
Secondly, commercial manufacture is simplified - only one expression and purification need 
be employed in order to produce two separately-useful proteins. 

Thus the invention provides a method for the simultaneous heterologous expression of two
or more proteins of the invention, in which said two or more proteins of the invention are 
fused (i.e. they are translated as a single polypeptide chain). 

The method will typically involve the steps of: obtaining a first nucleic acid encoding a first 
protein of the invention; obtaining a second nucleic acid encoding a second protein of the 
invention; ligating the first and second nucleic acids. The resulting nucleic acid may be
inserted into an expression vector, or may already be part of an expression vector. 

Preferably, the constituent proteins in a hybrid protein according to the invention will be 
from the same strain. 

The fused proteins in the hybrid may be joined directly, or may be joined via a linker peptide 
e.g. via a poly-glycine linker (i.e. Gn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more) or via a short
peptide sequence which facilitates cloning. It is evidently preferred not to join a ΔG protein
to the C-terminus of a poly-glycine linker.
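
The joining of hybrid partners, either directly or through a poly-glycine linker, can be sketched at the protein level as below. The function name and the toy partner sequences are hypothetical, and the sketch does not deal with leader peptides or with the cloning-derived linkers also mentioned above.

```python
def make_hybrid(partners: list, glycine_linker_len: int = 0) -> str:
    """Join partner proteins N- to C-terminus as one polypeptide chain.

    With glycine_linker_len == 0 the partners are joined directly;
    otherwise a Gn linker (n being 3 or more) is placed between them.
    """
    if glycine_linker_len != 0 and glycine_linker_len < 3:
        raise ValueError("a poly-glycine linker has n = 3 or more")
    linker = "G" * glycine_linker_len
    return linker.join(partners)


if __name__ == "__main__":
    # Toy partner sequences, not real Neisserial proteins.
    first, second = "MAAA", "KBBB"
    print(make_hybrid([first, second]))
    print(make_hybrid([first, second], glycine_linker_len=4))
```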

The fused proteins may lack native leader peptides or may include the leader peptide
sequence of the N-terminal fusion partner.

The method is well suited to the expression of proteins orf1, orf4, orf25, orf40, Orf46/46.1,
orf83, 233, 287, 292L, 564, 687, 741, 907, 919, 953, 961 and 983.

The 42 hybrids indicated by 'X' in the following table of form NH2-A-B-COOH are
preferred:

          ORF46.1  287  741  919  953  961  983
ORF46.1            X    X    X    X    X    X
287       X             X    X    X    X    X
741       X        X         X    X    X    X
919       X        X    X         X    X    X
953       X        X    X    X         X    X
961       X        X    X    X    X         X
983       X        X    X    X    X    X

Preferred proteins to be expressed as hybrids are thus ORF46.1, 287, 741, 919, 953, 961 and
983. These may be used in their essentially full-length form, or poly-glycine deletion (ΔG)
forms may be used (e.g. ΔG-287, ΔGTbp2, ΔG741, ΔG983 etc.), or truncated forms may be
used (e.g. Δ1-287, Δ2-287 etc.), or domain-deleted versions may be used (e.g. 287B, 287C,
287BC, ORF46(1-433), ORF46(433-608), ORF46, 961c etc.).
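
The figure of 42 preferred two-component hybrids is simply the number of ordered pairs that can be formed from the seven preferred proteins, since a hybrid A-B differs from B-A and a protein is not paired with itself. A short Python check, with hypothetical names, is:

```python
from itertools import permutations

# The seven preferred hybrid partners listed above.
PREFERRED = ["ORF46.1", "287", "741", "919", "953", "961", "983"]


def preferred_pairs() -> list:
    """All ordered pairs (A, B) for hybrids of form NH2-A-B-COOH, A != B."""
    return list(permutations(PREFERRED, 2))


if __name__ == "__main__":
    pairs = preferred_pairs()
    print(len(pairs))
    print(("919", "287") in pairs, ("287", "919") in pairs)
```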

Particularly preferred are: (a) a hybrid protein comprising 919 and 287; (b) a hybrid protein
comprising 953 and 287; (c) a hybrid protein comprising 287 and ORF46.1; (d) a hybrid
protein comprising ORF1 and ORF46.1; (e) a hybrid protein comprising 919 and ORF46.1;
(f) a hybrid protein comprising ORF46.1 and 919; (g) a hybrid protein comprising ORF46.1,
287 and 919; (h) a hybrid protein comprising 919 and 519; and (i) a hybrid protein
comprising ORF97 and 225. Further embodiments are shown in Figure 14.

Where 287 is used, it is preferably at the C-terminal end of a hybrid; if it is to be used at the
N-terminus, it is preferred to use a ΔG form of 287 (e.g. as the N-terminus of a
hybrid with ORF46.1, 919, 953 or 961).

Where 287 is used, this is preferably from strain 2996 or from strain 394/98. 

Where 961 is used, this is preferably at the N-terminus. Domain forms of 961 may be used.

Alignments of polymorphic forms of ORF46, 287, 919 and 953 are disclosed in 
WO00/66741. Any of these polymorphs can be used according to the present invention.

Temperature

In a sixth approach to heterologous expression, proteins of the invention are expressed at a 
low temperature. 

Expressed Neisserial proteins (e.g. 919) may be toxic to E.coli, which can be avoided by
expressing the toxic protein at a temperature at which its toxic activity is not manifested.

Thus the present invention provides a method for the heterologous expression of a protein of 
the invention, in which expression of a protein of the invention is carried out at a 
temperature at which a toxic activity of the protein is not manifested. 

A preferred temperature is around 30°C. This is particularly suited to the expression of 919.

Mutations

As discussed above, expressed Neisserial proteins may be toxic to E.coli. This toxicity can 
be avoided by mutating the protein to reduce or eliminate the toxic activity. In particular, 
mutations to reduce or eliminate toxic enzymatic activity can be used, preferably using site- 
directed mutagenesis. 

In a seventh approach to heterologous expression, therefore, an expressed protein is mutated
to reduce or eliminate toxic activity. 

Thus the invention provides a method for the heterologous expression of a protein of the 
invention, in which the protein is mutated to reduce or eliminate toxic activity.

The method is preferably used for the expression of protein 907, 919 or 922. A preferred
mutation in 907 is at Glu-117 (e.g. Glu→Gly); preferred mutations in 919 are at Glu-255
(e.g. Glu→Gly) and/or Glu-323 (e.g. Glu→Gly); preferred mutations in 922 are at Glu-164
(e.g. Glu→Gly), Ser-213 (e.g. Ser→Gly) and/or Asn-348 (e.g. Asn→Gly).
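
A point substitution of this kind can be sketched at the protein level as follows. The function name is hypothetical; the check on the expected wild-type residue is a design choice intended to catch numbering errors. The demonstration uses residues 251-260 of protein 919 as listed in Example 1, in which position 255 is a glutamate.

```python
def substitute(seq: str, position: int, expected: str, replacement: str,
               first_position: int = 1) -> str:
    """Replace the residue at a numbered position, checking the wild type.

    first_position is the residue number of seq[0], so that a window cut
    from a longer protein can keep the numbering of the full sequence.
    """
    index = position - first_position
    if not 0 <= index < len(seq):
        raise ValueError("position lies outside the sequence")
    if seq[index] != expected:
        raise ValueError(
            f"expected {expected} at {position}, found {seq[index]}")
    return seq[:index] + replacement + seq[index + 1:]


if __name__ == "__main__":
    window_919 = "EDPVELFFMH"  # residues 251-260 of 919 (Example 1)
    print(substitute(window_919, 255, "E", "G", first_position=251))
```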

Alternative vectors 

In an eighth approach to heterologous expression, an alternative vector is used to express the
protein. This may be to improve expression yields, for instance, or to utilise plasmids that are
already approved for GMP use.

Thus the invention provides a method for the heterologous expression of a protein of the 
invention, in which an alternative vector is used. The alternative vector is preferably 
pSM214, with no fusion partners. Leader peptides may or may not be included.

This approach is particularly useful for protein 953. Expression and localisation of 953 with
its native leader peptide expressed from pSM214 is much better than from the pET vector. 

pSM214 may also be used with: ΔG287, Δ2-287, Δ3-287, Δ4-287, Orf46.1, 961L, 961,
961(MC58), 961c, 961c-L, 919, 953 and ΔG287-Orf46.1.

Another suitable vector is pET-24b (Novagen; uses kanamycin resistance), again using no
fusion partners. pET-24b is preferred for use with: ΔG287K, Δ2-287K, Δ3-287K, Δ4-287K,
Orf46.1-K, Orf46A-K, 961-K (MC58), 961a-K, 961b-K, 961c-K, 961c-L-K, 961d-K,
ΔG287-919-K, ΔG287-Orf46.1-K and ΔG287-961-K.

Multimeric form 

In a ninth approach to heterologous expression, a protein is expressed or purified such that it
adopts a particular multimeric form. 

This approach is particularly suited to protein 953. Purification of one particular multimeric 
form of 953 (the monomeric form) gives a protein with greater bactericidal activity than 
other forms (the dimeric form). 

Proteins 287 and 919 may be purified in dimeric forms.

Protein 961 may be purified in a 180kDa oligomeric form (e.g. a tetramer). 

Lipidation 

In a tenth approach to heterologous expression, a protein is expressed as a lipidated protein. 

Thus the invention provides a method for the heterologous expression of a protein of the 
invention, in which the protein is expressed as a lipidated protein.

This is particularly useful for the expression of 919, 287, ORF4, 406, 576-1, and ORF25. 
Polymorphic forms of 919, 287 and ORF4 are disclosed in WO00/66741. 

The method will typically involve the use of an appropriate leader peptide without using an 
N-terminal fusion partner. 

C-terminal deletions

In an eleventh approach to heterologous expression, the C-terminus of a protein of the 
invention is mutated. In addition, it is preferred that no fusion partner is used.

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which (a) the protein's C-terminus region is mutated and, optionally, (b) no 
fusion partner is used. 

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of 
the invention; manipulating said nucleic acid to mutate nucleotides that encode the protein's
C-terminus portion. The resulting nucleic acid may be inserted into an expression vector, or 
may already be part of an expression vector. The first amino acid of the expressed protein 
will be that of the mature native protein. 

The mutation may be a substitution, insertion or, preferably, a deletion. 

This method can increase the levels of expression, particularly for proteins 730, ORF29 and
ORF46. For protein 730, a C-terminus region of around 65 to around 214 amino acids may
be deleted; for ORF46, the C-terminus region of around 175 amino acids may be deleted; for
ORF29, the C-terminus may be deleted to leave around 230-370 N-terminal amino acids.
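
A C-terminal deletion is, at the protein level, a simple truncation. The Python sketch below uses a hypothetical function name and a placeholder sequence; the length of 608 residues is taken from the ORF46 domain coordinates given earlier (amino acids 1-433 and 433-608) and is used here only to show that deleting around 175 C-terminal residues leaves the first 433.

```python
def delete_c_terminus(protein: str, n_deleted: int) -> str:
    """Remove n_deleted residues from the C-terminus of a protein."""
    if not 0 <= n_deleted < len(protein):
        raise ValueError("deletion must leave at least one residue")
    return protein[:len(protein) - n_deleted]


if __name__ == "__main__":
    # Placeholder of the assumed ORF46 length, not the real sequence.
    placeholder_orf46 = "A" * 608
    truncated = delete_c_terminus(placeholder_orf46, 175)
    print(len(truncated))
```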

Leader peptide mutation 

In a twelfth approach to heterologous expression, the leader peptide of the protein is
mutated. This is particularly useful for the expression of protein 919. 

Thus the invention provides a method for the heterologous expression of a protein of the 
invention, in which the protein's leader peptide is mutated. 

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of 
the invention; and manipulating said nucleic acid to mutate nucleotides within the leader
peptide. The resulting nucleic acid may be inserted into an expression vector, or may already 
be part of an expression vector. 

Poly-glycine deletion 

In a thirteenth approach to heterologous expression, poly-glycine stretches in wild-type 
sequences are mutated. This enhances protein expression.

The poly-glycine stretch has the sequence (Gly)n, where n>4 (e.g. 5, 6, 7, 8, 9 or more). This
stretch is mutated to disrupt or remove the (Gly)n. This may be by deletion (e.g. CGGGGS →
CGGGS, CGGS, CGS or CS), by substitution (e.g. CGGGGS → CGXGGS, CGXXGS,
CGXGXS etc.), and/or by insertion (e.g. CGGGGS → CGGXGGS, CGXGGGS, etc.).

This approach is not restricted to Neisserial proteins - it may be used for any protein
(particularly bacterial proteins) to enhance heterologous expression. For Neisserial proteins, 
however, it is particularly suitable for expressing 287, 741, 983 and Tbp2. An alignment of 
polymorphic forms of 287 is disclosed in WO00/66741. 

Thus the invention provides a method for the heterologous expression of a protein of the
invention, in which a poly-glycine stretch within the protein is mutated.

The method will typically involve the steps of: obtaining nucleic acid encoding a protein of 
the invention; and manipulating said nucleic acid to mutate nucleotides that encode a
poly-glycine stretch within the protein sequence. The resulting nucleic acid may be inserted into
an expression vector, or may already be part of an expression vector.
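
At the protein level, the deletion variants of a poly-glycine stretch (e.g. CGGGGS to CGGGS, CGGS, CGS or CS) can be generated by shortening each glycine run. The following Python sketch uses a hypothetical function name and works on amino acid strings rather than on the encoding nucleotides. The minimum run length is left as a parameter, set here to four so that the example stretch CGGGGS is recognised; this default is an assumption of the sketch.

```python
import re


def shorten_glycine_runs(protein: str, min_run: int = 4, keep: int = 1) -> str:
    """Shorten every run of at least min_run glycines to keep glycines.

    keep=0 removes the run entirely; keep must be smaller than min_run
    so that the result no longer contains a qualifying stretch.
    """
    if not 0 <= keep < min_run:
        raise ValueError("keep must be between 0 and min_run - 1")
    pattern = re.compile("G{%d,}" % min_run)
    return pattern.sub("G" * keep, protein)


if __name__ == "__main__":
    for keep in (3, 2, 1, 0):
        print(shorten_glycine_runs("CGGGGS", keep=keep))
```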

Conversely, the opposite approach (i.e. introduction of poly-glycine stretches) can be used to
suppress or diminish expression of a given heterologous protein. 

Heterologous host 

Whilst expression of the proteins of the invention may take place in the native host (i.e. the 
organism in which the protein is expressed in nature), the present invention utilises a
heterologous host. The heterologous host may be prokaryotic or eukaryotic. It is preferably
E.coli, but other suitable hosts include Bacillus subtilis, Vibrio cholerae, Salmonella typhi,
Salmonella typhimurium, Neisseria meningitidis, Neisseria gonorrhoeae, Neisseria
lactamica, Neisseria cinerea, Mycobacteria (e.g. M. tuberculosis), yeast etc.

Vectors etc.

As well as the methods described above, the invention provides (a) nucleic acid and vectors 
useful in these methods (b) host cells containing said vectors (c) proteins expressed or 
expressible by the methods (d) compositions comprising these proteins, which may be
suitable as vaccines, for instance, or as diagnostic reagents, or as immunogenic compositions
(e) these compositions for use as medicaments (e.g. as vaccines) or as diagnostic reagents (f)
the use of these compositions in the manufacture of (1) a medicament for treating or
preventing infection due to Neisserial bacteria (2) a diagnostic reagent for detecting the
presence of Neisserial bacteria or of antibodies raised against Neisserial bacteria, and/or (3) a
reagent which can raise antibodies against Neisserial bacteria and (g) a method of treating a
patient, comprising administering to the patient a therapeutically effective amount of these
compositions. 

Sequences 

The invention also provides a protein or a nucleic acid having any of the sequences set out in 
the following examples. It also provides proteins and nucleic acid having sequence identity
to these. As described above, the degree of 'sequence identity' is preferably greater than
50% (e.g. 60%, 70%, 80%, 90%, 95%, 99% or more).

Furthermore, the invention provides nucleic acid which can hybridise to the nucleic acid 
disclosed in the examples, preferably under "high stringency" conditions (e.g. 65°C in a
0.1xSSC, 0.5% SDS solution).

The invention also provides nucleic acid encoding proteins according to the invention. 

It should also be appreciated that the invention provides nucleic acid comprising sequences 
complementary to those described above (e.g. for antisense or probing purposes).

Nucleic acid according to the invention can, of course, be prepared in many ways (e.g. by
chemical synthesis, from genomic or cDNA libraries, from the organism itself etc.) and can
take various forms (e.g. single stranded, double stranded, vectors, probes etc.).

In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such 
as those containing modified backbones, and also peptide nucleic acids (PNA) etc. 

BRIEF DESCRIPTION OF DRAWINGS 

Figures 1 and 2 show constructs used to express proteins using heterologous leader peptides.

Figure 3 shows expression data for ORF1, and Figure 4 shows similar data for protein 961. 
Figure 5 shows domains of protein 287, and Figures 6 & 7 show deletions within domain A. 
Figure 8 shows domains of protein 564. 

Figure 9 shows the PhoC reporter gene driven by the 919 leader peptide, and Figure 10 
shows the results obtained using mutants of the leader peptide.

Figure 11 shows insertion mutants of protein 730 (A: 730-C1; B: 730-C2). 

Figure 12 shows domains of protein 961.

Figure 13 shows SDS-PAGE of ΔG proteins. Dots show the main recombinant product.

Figure 14 shows 26 hybrid proteins according to the invention.

MODES FOR CARRYING OUT THE INVENTION 

Example 1 - 919 and its leader peptide

Protein 919 from N. meningitidis (serogroup B, strain 2996) has the following sequence:

    1 MKKYLFRAAL YGIAAAILAA CQSKSIQTFP QPDTSVINGP DRPVGIPDPA
   51 GTTVGGGGAV YTVVPHLSLP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV
  101 CAQAFQTPVH SFQAKQFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR
  151 RTAQARFPIY GIPDDFISVP LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT
  201 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA
  251 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL
  301 KLGQTSMQGI KAYMRQNPQR LAEVLGQNPS YIFFRELAGS SNDGPVGALG
  351 TPLMGEYAGA VDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG
  401 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P*

The leader peptide comprises residues 1-20 (MKKYLFRAAL YGIAAAILAA).

The sequences of 919 from other strains can be found in Figures 7 and 18 of WO00/66741.

Example 2 of WO99/57280 discloses the expression of protein 919 as a His-fusion in E.coli. 
The protein is a good surface-exposed immunogen. 

Three alternative expression strategies were used for 919: 
1) 919 without its leader peptide (and without the mature N-terminal cysteine) and
without any fusion partner ('919untagged'):

    1 QSKSIQTFP QPDTSVINGP DRPVGIPDPA GTTVGGGGAV YTVVPHLSLP
   50 HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV CAQAFQTPVH SFQAKQFFER
  100 YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR RTAQARFPIY GIPDDFISVP
  150 LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT HTADLSRFPI TARTTAIKGR
  200 FEGSRFLPYH TRNQINGGAL DGKAPILGYA EDPVELFFMH IQGSGRLKTP
  250 SGKYIRIGYA DKNEHPYVSI GRYMADKGYL KLGQTSMQGI KAYMRQNPQR
  300 LAEVLGQNPS YIFFRELAGS SNDGPVGALG TPLMGEYAGA VDRHYITLGA
  350 PLFVATAHPV TRKALNRLIM AQDTGSAIKG AVRVDYFWGY GDEAGELAGK
  400 QKTTGYVWQL LPNGMKPEYR P*

The leader peptide and cysteine were omitted by designing the 5'-end amplification
primer downstream from the predicted leader sequence.

2) 919 with its own leader peptide but without any fusion partner ('919L'); and

3) 919 with the leader peptide (mktffktlsaaalalilaa) from ORF4 ('919LOrf4'):

    1 MKTFFKTLS AAALALILAA CQSKSIQTFP QPDTSVINGP DRPVGIPDPA
   50 GTTVGGGGAV YTVVPHLSLP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV
  100 CAQAFQTPVH SFQAKQFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR
  150 RTAQARFPIY GIPDDFISVP LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT
  200 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA
  250 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL
  300 KLGQTSMQGI KSYMRQNPQR LAEVLGQNPS YIFFRELAGS SNDGPVGALG
  350 TPLMGEYAGA VDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG
  400 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P*

To make this construct, the entire sequence encoding the ORF4 leader peptide was 
included in the 5'-primer as a tail (primer 919Lorf4 For). An NheI restriction site was
generated by a double nucleotide change in the sequence coding for the ORF4 leader
(no amino acid changes), to allow different genes to be fused to the ORF4 leader
peptide sequence. A stop codon was included in all the 3'-end primer sequences.

All three forms of the protein were expressed and could be purified. 

The '919L' and '919LOrf4' expression products were both lipidated, as shown by the
incorporation of [3H]-palmitate label. 919untagged did not incorporate the 3H label and was
located intracellularly.

919LOrf4 could be purified more easily than 919L. It was purified and used to immunise 
mice. The resulting sera gave excellent results in FACS and ELISA tests, and also in the 
bactericidal assay. The lipoprotein was shown to be localised in the outer membrane. 

919untagged gave excellent ELISA titres and high serum bactericidal activity. FACS confirmed
its cell surface location. 

Example 2 - 919 and expression temperature 

Growth of E.coli expressing the 919LOrf4 protein at 37°C resulted in lysis of the bacteria. In
order to overcome this problem, the recombinant bacteria were grown at 30°C. Lysis was 
prevented without preventing expression. 

Example 3 - mutation of 907, 919 and 922 

It was hypothesised that proteins 907, 919 and 922 are murein hydrolases, and more 
particularly lytic transglycosylases. Murein hydrolases are located on the outer membrane 
and participate in the degradation of peptidoglycan. 

The purified proteins 919untagged, 919Lorf4, 919-His (i.e. with a C-terminus His-tag) and
922-His were thus tested for murein hydrolase activity [Ursinus & Holtje (1994) J.Bact.
176:338-343]. Two different assays were used, one determining the degradation of insoluble
murein sacculus into soluble muropeptides and the other measuring breakdown of
poly(MurNAc-GlcNAc)n>30 glycan strands.

The first assay uses murein sacculi radiolabelled with meso-2,6-diamino-3,4,5-[3H]pimelic
acid as substrate. Enzyme (3-10 µg total) was incubated for 45 minutes at 37°C in a total
volume of 100µl comprising 10mM Tris-maleate (pH 5.5), 10mM MgCl2, 0.2% v/v Triton
X-100 and [3H]A2pm labelled murein sacculi (about 10000cpm). The assay mixture was
placed on ice for 15 minutes with 100µl of 1% w/v N-acetyl-N,N,N-trimethylammonium
and precipitated material was pelleted by centrifugation at 10000g for 15 minutes.
The radioactivity in the supernatant was measured by liquid scintillation counting. E.coli
soluble lytic transglycosylase Slt70 was used as a positive control for the assay; the negative
control comprised the above assay solution without enzyme.

All proteins except 919-His gave positive results in the first assay.

The second assay monitors the hydrolysis of poly(MurNAc-GlcNAc) glycan strands. Purified
strands, poly(MurNAc-GlcNAc)n>30 labelled with N-acetyl-D-1-[3H]glucosamine, were
incubated with 3 µg of 919L in 10 mM Tris-maleate (pH 5.5), 10 mM MgCl2 and 0.2% v/v
Triton X-100 for 30 min at 37°C. The reaction was stopped by boiling for 5 minutes and the
pH of the sample adjusted to about 3.5 by addition of 10 µl of 20% v/v phosphoric acid.
Substrate and product were separated by reversed phase HPLC on a Nucleosil 300 C18
column as described by Harz et al [Anal Biochem. (1990) 190:120-128]. The E.coli lytic
transglycosylase MltA was used as a positive control in the assay. The negative control was
performed in the absence of enzyme.

By this assay, the ability of 919LOrf4 to hydrolyse isolated glycan strands was demonstrated
when anhydrodisaccharide subunits were separated from the oligosaccharide by HPLC. 

Protein 919LOrf4 was chosen for kinetic analyses. The activity of 919LOrf4 was enhanced
3.7-fold by the addition of 0.2% v/v Triton X-100 in the assay buffer. The presence of Triton
X-100 had no effect on the activity of 919 untagged. The effect of pH on enzyme activity was
determined in Tris-maleate buffer over a range of 5.0 to 8.0. The optimal pH for the reaction
was determined to be 5.5. Over the temperature range 18°C to 42°C, maximum activity was
observed at 37°C. The effect of various ions on murein hydrolase activity was determined by
performing the reaction in the presence of a variety of ions at a final concentration of 10 mM.
Maximum activity was found with Mg2+, which stimulated activity 2.1-fold. Mn2+ and Ca2+
also stimulated enzyme activity to a similar extent, while the addition of Ni2+ and EDTA had no
significant effect. In contrast, both Fe2+ and Zn2+ significantly inhibited enzyme activity.

The structures of the reaction products resulting from the digestion of unlabelled E.coli
murein sacculus were analysed by reversed-phase HPLC as described by Glauner [Anal
Biochem. (1988) 172:451-464]. Murein sacculi digested with the muramidase Cellosyl were
used to calibrate and standardise the Hypersil ODS column. The major reaction products
were 1,6-anhydrodisaccharide tetra- and tripeptides, demonstrating the formation of a
1,6-anhydromuramic acid intramolecular bond.

These results demonstrate experimentally that 919 is a murein hydrolase and in particular a 
member of the lytic transglycosylase family of enzymes. Furthermore the ability of 922-His 
to hydrolyse murein sacculi suggests this protein is also a lytic transglycosylase. 

This activity may help to explain the toxic effects of 919 when expressed in E.coli.

In order to eliminate the enzymatic activity, rational mutagenesis was used. 907, 919 and 
922 show fairly low homology to three membrane-bound lipidated murein lytic 
transglycosylases from E.coli: 

919 (441aa) is 27.3% identical over 440aa overlap to E.coli MLTA (P46885);

922 (369aa) is 38.7% identical over 310aa overlap to E.coli MLTB (P41052); and

907-2 (207aa) is 26.8% identical over 149aa overlap to E.coli MLTC (P52066).

907-2 also shares homology with E.coli MLTD (P23931) and Slt70 (P03810), a soluble lytic
transglycosylase that is located in the periplasmic space. No significant sequence homology
can be detected among 919, 922 and 907-2, and the same is true among the corresponding
MLTA, MLTB and MLTC proteins.
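
As an illustration of how a figure such as "27.3% identical over 440aa overlap" is derived, the following minimal Python sketch computes percent identity from a pairwise alignment. The toy sequences are hypothetical and are not taken from 919 or MLTA, and the choice of denominator (columns in which neither sequence has a gap) is an assumption; alignment programs differ on this convention.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns where neither aligned row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences contribute a residue.
    overlap = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not overlap:
        return 0.0
    matches = sum(1 for a, b in overlap if a == b)
    return 100.0 * matches / len(overlap)


# Hypothetical toy alignment: 9 gap-free columns, 7 of them identical.
toy_a = "MKT-AYIAKQR"
toy_b = "MKSLAY-AKQK"
print(round(percent_identity(toy_a, toy_b), 1))
```

The same function could be applied to any of the alignments referred to above once the aligned rows are available as gapped strings of equal length.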

Crystal structures are available for Slt70 [1QTEA; 1QTEB; Thunnissen et al. (1995)
Biochemistry 34:12729-12737] and for Slt35 [1LTM; 1QUS; 1QUT; van Asselt et al (1999)
Structure Fold Des 7:1167-80] which is a soluble form of the 40kDa MLTB.

The catalytic residue (a glutamic acid) has been identified for both Slt70 and MLTB. 

In the case of Slt70, mutagenesis studies have demonstrated that even a conservative
substitution of the catalytic Glu505 with a glutamine (Gln) causes the complete loss of
enzymatic activity. Although Slt35 has no obvious sequence similarity to Slt70, their
catalytic domains show a surprising similarity. The corresponding catalytic residue in
MLTB is Glu162.

Another residue which is believed to play an important role in the correct folding of the
enzymatic cleft is a well-conserved glycine (Gly) downstream of the glutamic acid.
Recently, Terrak et al [Mol Microbiol (1999) 34:350-64] have suggested the presence of
another important residue, which is an aromatic amino acid located around 70-75 residues
downstream of the catalytic glutamic acid.

Sequence alignments of Slt70 with 907-2 and of MLTB with 922 were performed in order to
identify the corresponding catalytic residues in the MenB antigens.

The two alignments in the region of the catalytic domain are reported below: 

907-2/Slt70: 

10 90 100 110 T120 130 140 

907-2 .pep ERRRLLWIQYESSRAG — LDTQ I VLGL I EVE S AFRQYAI SGVGARGLMQVMPFWKNYIG 

||||:: :| : : :::: : |||: : | |J| Tllhll - 
S 1 ty_ecol i ERFPLAYNDLFKRYTSGKE IPQSYAMAI ARQESAWNPKVKS PVGASGLMQ IMPGTATHTV 
480 490 500 ▲ 510 520 530 

15 GLU505 

922/MLTB 



20 



150 160 T 170 180 190 200 

922. pep VAQK YGVP AEL» I VAV I G I ETNYGKNTG S F RVADAIj ATLGFDY PRRAGFFQKE LVE liLKL A 

= I Mil hli::)|:|l = T: h 1 = I I I I I I : I = I 1 I I I = h |] :| :| 
ml tb_ec o 1 i AWQVYGVPPE I IVGI IGVETRWGRVMGKTRILDALATLSFNYPRRAEYFSGELETFLLMA 
150 160 A 170 180 190 200 

GLU162 



25 210 220 230 240 250 260 

92 2 . pep KEEGGDVFAFKGSYAGAMGMPQFMPSSYRKWAVDYDGDGHRDIWGNVGDVAASVANYMKQ 

-I I = =111=11111= IIMIlT:::ll|::IIII -I I 1 = =11111=1 
mltb_ecoli RDEQDDPLNLKGSFAGAMGYGQFMPSSYKQYAVDFSGDGHINLWDPV-DAIGSVANYFKA 
210 220 230 240 250 260 

30 

From these alignments, the corresponding catalytic glutamate in 907-2 is
Glu117, whereas in 922 it is Glu164. Both antigens also share downstream glycines that could
have a structural role in the folding of the enzymatic cleft (in bold), and 922 has a conserved
aromatic residue around 70aa downstream (in bold).
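
The transfer of a residue number from one protein to its aligned partner can be sketched as follows. This is a minimal Python illustration, not the procedure actually used: the function name `map_residue` is hypothetical, and the fragment start numbers (502 for Slt70, 114 for 907-2) are back-calculated from the stated Glu505 and Glu117, so the example is consistent by construction rather than an independent confirmation.

```python
def map_residue(aligned_ref, aligned_query, ref_pos,
                ref_start=1, query_start=1, gap="-"):
    """Map a residue number in the reference to the aligned query residue.

    Returns (query_position, query_residue), or None if the reference
    residue is aligned to a gap in the query.
    """
    r = ref_start - 1    # number of the last reference residue seen
    q = query_start - 1  # number of the last query residue seen
    for a, b in zip(aligned_ref, aligned_query):
        if a != gap:
            r += 1
        if b != gap:
            q += 1
        if a != gap and r == ref_pos:
            return (q, b) if b != gap else None
    raise ValueError("ref_pos lies outside the aligned region")


# Short ungapped fragments around the catalytic glutamate, as printed in
# the 907-2/Slt70 alignment (start numbers are assumptions, see above).
slt70_fragment = "ARQESAW"   # assumed to start at residue 502
p907_fragment = "IEVESAF"    # assumed to start at residue 114
print(map_residue(slt70_fragment, p907_fragment, 505,
                  ref_start=502, query_start=114))
```

Gapped alignments are handled by the same loop, since gap columns advance only the counter of the sequence that contributes a residue.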

In the case of protein 919, no 3D structure is available for its E.coli homologue MLTA, and
nothing is known about a possible catalytic residue. Nevertheless, three amino acids in 919
are predicted as catalytic residues by alignment with MLTA:

919/MLTA 

240 250 T 260 □ □ 270 □ 280 290 

40 919 .pep ALDGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRI-GYADKNEHPYVSIGRYMADK 

11= I II:]::: == hi =1111 = =1= : = =11 II I I 111= = 1= 
mlta_ecoli .p ALSDKY-ILAYSNSLMDNFIMDVQGSGYIDFGDGSPLNFFSYAGKNGHAYRSIGKVLIDR 

170 180 190 200 210 
919.pep



300 310 320 ▼ 330O □□ 340 0350 0 

GYLKLGQTSMQGIKSYMRQNPQ-RDAEVLGQNPSYIFFRELAGSSNDGPV-GALGTPLMG 



5 



mlta_ecoli .p GEVKKEDMSMQAIRHWGETHSEAEVRELLEQNPSFVFFKPQSFA - PVKGASAVPLVG 

220 230 240 250 260 270 



360 T o 380 390 400 00410 

9 1 9 . pep E YAGAVDRHY I T LGAP LFVAT AH P VTRKALN RLIMAQDTGSAIKGAVRVDYFWGY 



10 



mlta_ecoli .p RASVASDRSIIPPGTTLLAEVPLLDNNGKFNGQYELRLMVALDVGGAIKGQ-HFDIYQGI 
280 290 300 310 320 330 



15 



420 o 
919. pep GDEAGELAGKQKTTGYVWQLLP 




340 350 



The three possible catalytic residues are shown by the symbol ▼:

1) Glu255 (Asp in MLTA), followed by three conserved glycines (Gly263, Gly265 and
Gly272) and three conserved aromatic residues located approximately 75-77 residues
downstream. These downstream residues are shown by □.

2) Glu323 (conserved in MLTA), followed by 2 conserved glycines (Gly347 and Gly355)
and two conserved aromatic residues located 84-85 residues downstream (Tyr406 or
Phe407). These downstream residues are shown by 0.

3) Asp362 (instead of the expected Glu), followed by one glycine (Gly369) and a
conserved aromatic residue (Trp428). These downstream residues are shown by o.

Alignments of polymorphic forms of 919 are disclosed in WO00/66741. 

Based on the prediction of catalytic residues, three mutants of 919 and one mutant of
907, each containing a single amino acid substitution, have been generated. The glutamic
acids at positions 255 and 323 and the aspartic acid at position 362 of the 919 protein, and
the glutamic acid at position 117 of the 907 protein, were replaced with glycine residues
using PCR-based site-directed mutagenesis (SDM). To do this, internal primers containing a
codon change from Glu or Asp to Gly were designed:

Primers        Sequences                       Codon change
919-E255 for   CGAAGACCCCGTCGgtCTTTTTTTTATG    GAA -> Ggt
919-E255 rev   GTGCATAAAAAAAAGacCGACGGGGTCT
919-E323 for   AACGCCTCGCCGgtGTTTTGGGTCA       GAA -> Ggt
919-E323 rev   TTTGACCCAAAACacCGGCGAGGCG
919-D362 for   TGCCGGCGCAGTCGgtCGGCACTACA      GAC -> Ggt
919-D362 rev   TAATGTAGTGCCGacCGACTGCGCCG
907-E117 for   TGATTGAGGTGGgtAGCGCGTTCCG       GAA -> Ggt
907-E117 rev   GGCGGAACGCGCTacCCACCTCAAT


Underlined nucleotides code for glycine; the mutated nucleotides are in lower case. 
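
The primer design can be checked computationally. The sketch below is an illustrative Python check and not part of the original procedure; the helper names are hypothetical. It assumes the convention of the primer table, i.e. that in each forward primer the mutated codon consists of the base preceding the lower-case bases plus the two lower-case bases. It confirms that each mutated codon encodes glycine and that each forward/reverse pair is complementary over the mutated region.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Only the codons discussed in the text (standard genetic code).
CODON_TO_AA = {"GAA": "Glu", "GAG": "Glu", "GAC": "Asp", "GAT": "Asp",
               "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly"}

# (forward, reverse) primer pairs as listed in the primer table.
PRIMERS = {
    "919-E255": ("CGAAGACCCCGTCGgtCTTTTTTTTATG", "GTGCATAAAAAAAAGacCGACGGGGTCT"),
    "919-E323": ("AACGCCTCGCCGgtGTTTTGGGTCA", "TTTGACCCAAAACacCGGCGAGGCG"),
    "919-D362": ("TGCCGGCGCAGTCGgtCGGCACTACA", "TAATGTAGTGCCGacCGACTGCGCCG"),
    "907-E117": ("TGATTGAGGTGGgtAGCGCGTTCCG", "GGCGGAACGCGCTacCCACCTCAAT"),
}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (case-insensitive input)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mutated_codon(forward_primer):
    """Codon holding the lower-case bases of a forward primer (assumed frame)."""
    first = next(i for i, base in enumerate(forward_primer) if base.islower())
    return forward_primer[first - 1:first + 2].upper()


def overlap_length(forward_primer, reverse_primer):
    """Longest 3' stretch of the forward primer that matches the start of
    the reverse complement of the reverse primer."""
    fwd = forward_primer.upper()
    rc = reverse_complement(reverse_primer)
    for k in range(min(len(fwd), len(rc)), 0, -1):
        if fwd.endswith(rc[:k]):
            return k
    return 0


for name, (fwd, rev) in PRIMERS.items():
    codon = mutated_codon(fwd)
    print(name, codon, CODON_TO_AA[codon], overlap_length(fwd, rev))
```

The overlap between each pair is what allows the two first-round PCR products to anneal to one another and be extended in the second round.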



To generate the 919-E255, 919-E323 and 919-D362 mutants, PCR was performed using
20ng of the pET 919-LOrf4 DNA as template, and the following primer pairs:

1) Orf4L for / 919-E255 rev

2) 919-E255 for / 919L rev

3) Orf4L for / 919-E323 rev

4) 919-E323 for / 919L rev

5) Orf4L for / 919-D362 rev

6) 919-D362 for / 919L rev

The second round of PCR was performed using the products of PCR 1 and 2, 3 and 4, or 5 and 6 as
template, with "Orf4L for" and "919L rev" as forward and reverse primers, respectively.

For the mutant 907-E117, PCR was performed using 200ng of chromosomal DNA of
the 2996 strain as template and the following primer pairs:

7) 907L for / 907-E117 rev

8) 907-E117 for / 907L rev

The second round of PCR was performed using the products of PCR 7 and 8 as templates
and the oligos "907L for" and "907L rev" as primers.

The PCR fragments containing each mutation were processed following the standard
procedure, digested with NdeI and XhoI restriction enzymes and cloned into pET-21b+
vector. The presence of each mutation was confirmed by sequence analysis.

Mutation of Glu117 to Gly in 907 is carried out similarly, as is mutation of residues Glu164,
Ser213 and Asn348 in 922.

The E255G mutant of 919 shows a 50% reduction in activity; the E323G mutant shows a
70% reduction in activity; the D362G mutant shows no reduction in activity.

Example 4 - multimeric form 

287-GST, 919 untagged and 953-His were subjected to gel filtration for analysis of quaternary
structure or preparative purposes. The molecular weight of the native proteins was estimated
using either FPLC Superose 12 (H/R 10/30) or Superdex 75 gel filtration columns
(Pharmacia). The buffers used for chromatography for 287, 919 and 953 were 50 mM Tris-HCl
(pH 8.0), 20 mM Bicine (pH 8.5) and 50 mM Bicine (pH 8.0), respectively.
Additionally each buffer contained 150-200 mM NaCl and 10% v/v glycerol. Proteins were
dialysed against the appropriate buffer and applied in a volume of 200 µl. Gel filtration was
performed with a flow rate of 0.5 - 2.0 ml/min and the eluate monitored at 280nm. Fractions
were collected and analysed by SDS-PAGE. Blue dextran 2000 and the molecular weight
standards ribonuclease A, chymotrypsin A, ovalbumin and albumin (Pharmacia) were used to
calibrate the column. The molecular weight of the sample was estimated from a calibration
curve of Kav vs. log Mr of the standards. Before gel filtration, 287-GST was digested with
thrombin to cleave the GST moiety.
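
The calibration-curve estimate can be sketched as follows. This is a minimal Python illustration under stated assumptions: Kav is taken as (Ve - V0)/(Vt - V0), with V0 the void volume (e.g. from blue dextran) and Vt the total column volume; the column volumes, elution volumes and nominal masses of the standards below are hypothetical values chosen for illustration and are not data from these experiments.

```python
import math


def k_av(v_e, v_0, v_t):
    """Partition coefficient Kav = (Ve - V0) / (Vt - V0)."""
    return (v_e - v_0) / (v_t - v_0)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mr(kav_sample, standards):
    """Estimate Mr of a sample from (Mr, Kav) pairs of the standards."""
    slope, intercept = fit_line([math.log10(mr) for mr, _ in standards],
                                [kav for _, kav in standards])
    # Invert Kav = slope * log10(Mr) + intercept.
    return 10 ** ((kav_sample - intercept) / slope)


# Hypothetical void and total volumes (ml) and elution volumes (ml).
V0, VT = 8.0, 24.0
STANDARDS = [
    (13700, k_av(15.6, V0, VT)),  # ribonuclease A (assumed nominal mass)
    (25000, k_av(14.5, V0, VT)),  # chymotrypsin(ogen) A (assumed nominal mass)
    (43000, k_av(13.5, V0, VT)),  # ovalbumin (assumed nominal mass)
    (67000, k_av(12.7, V0, VT)),  # albumin (assumed nominal mass)
]

sample_mr = estimate_mr(k_av(13.3, V0, VT), STANDARDS)
print(f"estimated Mr: {sample_mr / 1000:.0f} kDa")
```

Comparing the estimate with the mass calculated from the amino acid sequence then indicates whether the native protein is more likely monomeric or multimeric.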

The estimated molecular weights for 287, 919 and 953-His were 73 kDa, 47 kDa and 43 kDa
respectively. These results suggest 919 is monomeric while both 287 and 953 are principally
dimeric in their nature. In the case of 953-His, two peaks were observed during gel filtration.
The major peak (80%) represented a dimeric conformation of 953 while the minor peak
(20%) had the expected size of a monomer. The monomeric form of 953 was found to have
greater bactericidal activity than the dimer.

Example 5 - pSM214 and pET-24b vectors 

953 protein with its native leader peptide and no fusion partners was expressed from the pET
vector and also from pSM214 [Velati Bellini et al (1991) J. Biotechnol. 18, 177-192].

The 953 sequence was cloned as a full-length gene into pSM214 using the E. coli MM294-1 
strain as a host. To do this, the entire DNA sequence of the 953 gene (from ATG to the 
STOP codon) was amplified by PCR using the following primers: 

953L for/2   CCGGAATTCTTATGAAAAAAATCATCTTCGCCGC   EcoRI

953L rev/2   GCCCAAGCTTTTATTGTTTGGCTGCCTCGATT     HindIII

which contain EcoRI and HindIII restriction sites, respectively. The amplified fragment was
digested with EcoRI and HindIII and ligated with the pSM214 vector digested with the same
two enzymes. The ligated plasmid was transformed into E.coli MM294-1 cells (by
incubation in ice for 65 minutes at 37° C) and bacterial cells plated on LB agar containing
20 µg/ml of chloramphenicol.

Recombinant colonies were grown overnight at 37°C in 4 ml of LB broth containing 20
µg/ml of chloramphenicol; bacterial cells were centrifuged and plasmid DNA extracted
and analysed by restriction with EcoRI and HindIII. To analyse the ability of the
recombinant colonies to express the protein, they were inoculated in LB broth containing
20 µg/ml of chloramphenicol and left to grow for 16 hours at 37°C. Bacterial cells were
centrifuged and resuspended in PBS. Expression of the protein was analysed by SDS-PAGE
and Coomassie Blue staining.

Expression levels were unexpectedly high from the pSM214 plasmid.

Oligos used to clone sequences into pSM-214 vectors were as follows:

Construct (pSM-214)   Primer   Sequence                                  Restriction site
ΔG287                 Fwd      CCGGAATTCTTATG-TCGCCCGATGTTAAATCGGCGGA    EcoRI
                      Rev      GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG         HindIII
Δ2 287                Fwd      CCGGAATTCTTATG-AGCCAAGATATGGCGGCAGT       EcoRI
                      Rev      GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG         HindIII
Δ3 287                Fwd      CCGGAATTCTTATG-TCCGCCGAATCCGCAAATCA       EcoRI
                      Rev      GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG         HindIII
Δ4 287                Fwd      CCGGAATTCTTATG-GGAAGGGTTGATTTGGCTAATG     EcoRI
                      Rev      GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG         HindIII
Orf46.1               Fwd      CCGGAATTCTTATG-TCAGATTTGGCAAACGATTCTT     EcoRI
                      Rev      GCCCAAGCTT-TTACGTATCATATTTCACGTGCTTC      HindIII
ΔG287-Orf46.1         Fwd      CCGGAATTCTTATG-TCGCCCGATGTTAAATCGGCGGA    EcoRI
                      Rev      GCCCAAGCTT-TTACGTATCATATTTCACGTGCTTC      HindIII
919                   Fwd      CCGGAATTCTTATG-CAAAGCAAGAGCATCCAAACCT     EcoRI
                      Rev      GCCCAAGCTT-TTACGGGCGGTATTCGGGCT           HindIII
961L                  Fwd      CCGGAATTCATATG-AAACACTTTCCATCC            EcoRI
                      Rev      GCCCAAGCTT-TTACCACTCGTAATTGAC             HindIII
961                   Fwd      CCGGAATTCATATG-GCCACAAGCGACGAC            EcoRI
                      Rev      GCCCAAGCTT-TTACCACTCGTAATTGAC             HindIII
961c L                Fwd      CCGGAATTCTTATG-AAACACTTTCCATCC            EcoRI
                      Rev      GCCCAAGCTT-TCAACCCACGTTGTAAGGTTG          HindIII
961c                  Fwd      CCGGAATTCTTATG-GCCACAAACGACGACG           EcoRI
                      Rev      GCCCAAGCTT-TCAACCCACGTTGTAAGGTTG          HindIII
953                   Fwd      CCGGAATTCTTATG-GCCACCTACAAAGTGGACGA       EcoRI
                      Rev      GCCCAAGCTT-TTATTGTTTGGCTGCCTCGATT         HindIII

These sequences were manipulated, cloned and expressed as described for 953L.

For the pET-24 vector, sequences were cloned and the proteins expressed in pET-24 as
described below for pET-21. pET-24 has the same sequence as pET-21, but with the kanamycin
resistance cassette instead of the ampicillin cassette.

Oligonucleotides used to clone sequences into pET-24b vector were:



Construct (pET-24b)   Primer   Sequence                                Restriction site
ΔG 287 K              Fwd      CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC §     NheI
                      Rev      CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC *      XhoI
Δ2 287 K              Fwd      CGCGGATCCGCTAGC-CAAGATATGGCGGCAGT §     NheI
Δ3 287 K              Fwd      CGCGGATCCGCTAGC-GCCGAATCCGCAAATCA §     NheI
Δ4 287 K              Fwd      CGCGCTAGC-GGAAGGGTTGATTTGGCTAATGG §     NheI
Orf46.1 K             Fwd      GGGAATTCCATATG-GGCATTTCCCGCAAAATATC     NdeI
                      Rev      CCCGCTCGAG-TTACGTATCATATTTCACGTGC       XhoI
Orf46A K              Fwd      GGGAATTCCATATG-GGCATTTCCCGCAAAATATC     NdeI
                      Rev      CCCGCTCGAG-TTATTCTATGCCTTGTGCGGCAT      XhoI
961 K (MC58)          Fwd      CGCGGATCCCATATG-GCCACAAGCGACGACGA       NdeI
                      Rev      CCCGCTCGAG-TTACCACTCGTAATTGAC           XhoI
961a K                Fwd      CGCGGATCCCATATG-GCCACAAACGACG           NdeI
                      Rev      CCCGCTCGAG-TCATTTAGCAATATTATCTTTGTTC    XhoI
961b K                Fwd      CGCGGATCCCATATG-AAAGCAAACAGTGCCGAC      NdeI
                      Rev      CCCGCTCGAG-TTACCACTCGTAATTGAC           XhoI
961c K                Fwd      CGCGGATCCCATATG-GCCACAAACGACG           NdeI
                      Rev      CCCGCTCGAG-TTAACCCACGTTGTAAGGT          XhoI
961cL K               Fwd      CGCGGATCCCATATG-ATGAAACACTTTCCATCC      NdeI
                      Rev      CCCGCTCGAG-TTAACCCACGTTGTAAGGT          XhoI
961d K                Fwd      CGCGGATCCCATATG-GCCACAAACGACG           NdeI
                      Rev      CCCGCTCGAG-TCAGTCTGACACTGTTTTATCC       XhoI
ΔG 287-919 K          Fwd      CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC       NheI
                      Rev      CCCGCTCGAG-TTACGGGCGGTATTCGG            XhoI
ΔG 287-Orf46.1 K      Fwd      CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC       NheI
                      Rev      CCCGCTCGAG-TTACGTATCATATTTCACGTGC       XhoI
ΔG 287-961 K          Fwd      CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC       NheI
                      Rev      CCCGCTCGAG-TTACCACTCGTAATTGAC           XhoI

* This primer was used as a Reverse primer for all the 287 forms.
§ Forward primers used in combination with the ΔG287 K reverse primer.
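
The cloning primers listed for the pSM-214 and pET-24b vectors can be checked against the restriction sites named beside them. The following Python sketch is illustrative only. It assumes that the hyphen in each primer separates the 5' tail (carrying the restriction site) from the gene-specific region, and it uses the standard recognition sequences of the named enzymes; the helper names are hypothetical. It also checks that each reverse primer begins its gene-specific region with the reverse complement of a stop codon.

```python
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "NheI": "GCTAGC",
         "NdeI": "CATATG", "XhoI": "CTCGAG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def split_primer(primer):
    """Split 'TAIL-SPECIFIC' into (tail, gene-specific part)."""
    tail, _, specific = primer.partition("-")
    return tail, specific


def has_site(primer, enzyme):
    """True if the 5' tail of the primer contains the enzyme's site."""
    tail, _ = split_primer(primer)
    return SITES[enzyme] in tail.upper()


def encodes_stop(reverse_primer):
    """True if the gene-specific part starts with a reverse-complemented stop."""
    _, specific = split_primer(reverse_primer)
    return reverse_complement(specific[:3]) in STOP_CODONS


# A few primers copied from the two tables above.
EXAMPLES = [
    ("CCGGAATTCTTATG-TCGCCCGATGTTAAATCGGCGGA", "EcoRI", False),
    ("GCCCAAGCTT-TCAATCCTGCTCTTTTTTGCCG", "HindIII", True),
    ("CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC", "NheI", False),
    ("GGGAATTCCATATG-GGCATTTCCCGCAAAATATC", "NdeI", False),
    ("CCCGCTCGAG-TTACGTATCATATTTCACGTGC", "XhoI", True),
]

for primer, enzyme, is_reverse in EXAMPLES:
    stop = encodes_stop(primer) if is_reverse else None
    print(enzyme, has_site(primer, enzyme), stop)
```

The same check can be run over every row of either table to catch transcription errors in the tails.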

Example 6 - ORF1 and its leader peptide

ORF1 from N. meningitidis (serogroup B, strain MC58) is predicted to be an outer membrane
or secreted protein. It has the following sequence:

1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN 

51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSWSRNG

101 VAALVGDQYI VSVAHNGGYN NVDFGAEGRN PDQHRFTYKI VKRNNYKAGT 

151 KGHPYGGDYH MPRLHKFVTD AEPVEMTSYM DGRKYIDQNN YPDRVRI GAG 

201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI 

5 251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF 

301 QLVRKDWFYD EIFAGDTHSV FYEPRQNGKY SFNDDNNGTG KINAKHEHNS 

3 51 LPNRLKTRTV QLFNVSLSET ARE P VYHAAG GWSYRPRLN NGENISFIDE 

401 GKGELILTSN INQGAGGLYF QGDFTVSPEN NETWQGAGVH ISEDSTVTWK 

451 VNGVANDRLS KIGKGTLHVQ AKGENQGSIS VGDGTVILDQ QADDKGKKQA 

10 501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRBDLNGH SLSFHRIQNT 

551 DEGAMIVWHN QDKESTVTIT GNKDIATTGN NNSLDSKKEI AYNGWFGEKD 

601 TTKTNGRLNL VYQPAAEDRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA 

651 YNHLNDHWSQ KEGIPRGEIV WDNDWINRTF KAENFQIKGG QAVVS RNVAK 

701 VKGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTNCVEK TITDDKVIAS 

15 751 LTKTDISGNV DLADHAHLNL TGLATLNGNL SANGDTRYTV SHNATQNGNL 

801 SLVGNAQATF NQATLNGNTS ASGNASFNLS DHAVQNGSLT LSGNAKANVS 

851 HSALNGNVSD ADKAVFHFES SRFTGQISGG KDTALHLKDS EWTLPSGTEL 

901 GNLNLDNATI TLNSAYRHDA AGAQTGSATD APRRRSRRSR RSLLSVTPPT 

951 SVESRFNTLT VNGKLNGQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN 

20 1001 NTGNEPASLE QLTWEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG 

1051 EFRLHNPVKE QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAVEKTES 

1101 VAEPARQAGG ENVGIMQAEE EKKRVQADKD TALAKQREAE TRPATTAFPR 

1151 ARRARRDLPQ LQPQPQPQPQ RDLISRYANS GLSEFSATLN SVFAVQDELD 

12 01 RVFAEDRRNA VWTSGIRDTK HYRSQDFRAY RQQTDLRQIG MQKNLGSGRV 
25 1251 GILFSHNRTE NTFDDGIGNS ARLAHGAVFG QYGIDRFYIG ISAGAGFSSG 

13 01 SLSDGIGGKI RRRVLHYGIQ ARYRAGFGGF GIEPHIGATR YFVQKADYRY 
1351 ENVNIATPGL AFNRYRAGIK ADYSFKPAQH ISITPYLSDS YTDAASGKVR 
1401 TRVNTAVLAQ DFGKTRSAEW GVNAEIKGFT LSLHAAAAKG PQLEAQHSAG 
1451 IKLGYRW* 

The leader peptide is underlined.

A polymorphic form of ORF1 is disclosed in WO99/55873.

Three expression strategies have been used for ORF1:

1) ORF1 using a His tag, following WO99/24578 (ORF1-His);

2) ORF1 with its own leader peptide but without any fusion partner ('ORF1L'); and

3) ORF1 with the leader peptide (mkktaiaiavalagfatvaqa) from E.coli OmpA
('ORF1LOmpA'):

MKKTAI AI AVALAGFATVAQAA SAGHTYFG INYQYYRDFAENKGKFAVGAKD I EVYNKKGELVGKSMTKAPMIDF S V 
VSRNGVAALVGDQY I VSVAHNGGYNNVDFGAEGRNPDQHRFT YK I VKRNNYKAGTKGHPYGGDYHMPRLiHKFVTDAE 
PVEMTSYMDGRKYIDQMKn^^ 

40 IKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGFQLVRKDWFYDEIFAGDTHSVFYEPRQ 
NGKY S FNDDNNGTGK I NAKHEHNS L PNRLKTRT VQLFNVS L S ET ARE PVYHAAGGVN S YR PRLNNGEN I S F I DEGKG 
EL I LT SNI NQGAGGL YFQGDFTVS PENNETWQGAGVH I S EDS TVTWKVNGVANDRIi SKI GKGTLHVQ AKGENQ G S I S 
VGDGTVILDQQADDKGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGHSLSFHRIQNTDEGAMIV 
NHNQDKESTVTITGNKDIATTGNNNSLDSKKEIAYNGWFGEKDTTKTNGRLNLVYQPA 

45 QTNGKLFFSGRPTPHAYNHLNDHWSQKEGIPRGEIVWDNDWINRTFKAENFQIKGGQAWSROT 

QAVFGVAPHQ SHT I CTRSDWTGLTNCVEKT ITDDKVI ASLTKTD I SGNVDL ADHAHLNLTGLATLNGNLSANGDTRY 
TVSHNATQNGNL S LVGNAQATFNQATLNGNT S AS GNAS FNL S DHAVQNGS LTL SGNAKAJWSH S ALNGNVSL ADKAV 
FHFESSRFTGQISGGKDTALHLKDSEWTLPSGTELGNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRS 
LLSVTPPTSVESRFNTLTVNGKLNGQGTFRFMSE 

50 NKPL S ENLNFTL QNEHVDAGAWRYQL I RKDGEFRLHN PVKEQEL S DKLGKAE AKKQ AEKDNAQ S LDAL I AAGRDAVE 

KO?ESVAEPARQAGGENVGIMQAEEEKKRVQADKDTALAKQREAETRPATTAFPRARRARRDLPQLQPQPQPQPQRDL 
ISRYANSGLSEFSATLNSVFAVQDELDRVFAEDRRWAWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKWLGSGRV 
GILF SHNRTENTFDDG I GNSARLAHGAVFGQYGIDRF YIGI S AGAGF S SGSL SDGIGGK I RRRVLHYG I QARYRAGF 
GGFG I EPH IGATRYFVQKADYRYENVNI AT PGL AFNRYRAGIKADYSFKPAQH I S ITPYLSLS YTDAASGKVRTRVN 

5 5 TAVLAQDFGKTRS AEWGVNAE IKGFTLSLHAAAAKGPQLEAQHS AGIKLGYRW* 

To make this construct, the clone pET911LOmpA (see below) was digested with the
NheI and XhoI restriction enzymes and the fragment corresponding to the vector
carrying the OmpA leader sequence was purified (pETLOmpA). The ORF1 gene
coding for the mature protein was amplified using the oligonucleotides ORF1-For
and ORF1-Rev (including the NheI and XhoI restriction sites, respectively), digested
with NheI and XhoI and ligated to the purified pETLOmpA fragment (see Figure 1).
An additional AS dipeptide was introduced by the NheI site.
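
The origin of the additional AS dipeptide can be illustrated with a short Python sketch. It translates the NheI recognition sequence (GCTAGC) in frame and assembles the start of the fusion protein. The start of the mature ORF1 sequence used here (AGHTYFG...) is inferred from the printed ORF1LOmpA sequence and should be read as an assumption, and the helper names are hypothetical.

```python
# Only the two codons present in the NheI site are needed here
# (standard genetic code: GCT = Ala, AGC = Ser).
CODON_TO_AA = {"GCT": "A", "AGC": "S"}


def translate(dna):
    """Translate an in-frame DNA string using the small table above."""
    dna = dna.upper()
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def fuse(leader, junction_dna, mature):
    """Leader peptide + residues encoded by the junction + mature protein."""
    return leader.upper() + translate(junction_dna) + mature.upper()


OMPA_LEADER = "mkktaiaiavalagfatvaqa"   # as given in the text
NHEI_SITE = "GCTAGC"                    # NheI recognition sequence
# Assumed first residues of mature ORF1 (inferred, see lead-in).
MATURE_ORF1_START = "AGHTYFGINYQYYRDFAENKGKFAVGAKD"

fusion_start = fuse(OMPA_LEADER, NHEI_SITE, MATURE_ORF1_START)
print(fusion_start)
```

Reading the site in a different frame would encode different residues, so the sketch applies only where the site is placed in frame with the leader, as assumed here.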

All three forms of the protein were expressed. The His-tagged protein could be purified and
was confirmed as surface exposed, and possibly secreted (see Figure 3). The protein was
used to immunise mice, and the resulting sera gave excellent results in the bactericidal assay.

ORF1LOmpA was purified as total membranes, and was localised in both the inner and
outer membranes. Unexpectedly, sera raised against ORF1LOmpA show even better ELISA
and bactericidal properties than those raised against the His-tagged protein.

ORF1L was purified as outer membranes, where it is localised.

Example 7 - protein 911 and its leader peptide

Protein 911 from N. meningitidis (serogroup B, strain MC58) has the following sequence:

1 MKKNILEFWV GLFVLIGAAA VAFLAFRVAG GAAFGGSDKT YAVYADFGDI
51 GGLKVNAPVK SAGVLVGRVG AIGLDPKSYQ ARVRLDLDGK YQFSSDVSAQ
101 ILTSGLLGEQ YIGLQQGGDT ENLAAGDTIS VTSSAMVLEN LIGKFMTSFA
151 EKNADGGNAE KAAE*

The leader peptide is underlined.

Three expression strategies have been used for 911:

1) 911 with its own leader peptide but without any fusion partner ('911L');

2) 911 with the leader peptide from E.coli OmpA ('911LOmpA').

To make this construct, the entire sequence encoding the OmpA leader peptide was
included in the 5'-primer as a tail (primer 911LOmpA Forward). A NheI restriction
site was inserted between the sequence coding for the OmpA leader peptide and the
911 gene encoding the predicted mature protein (insertion of one amino acid, a
serine), to allow the use of this construct to clone different genes downstream of the
OmpA leader peptide sequence.

3) 911 with the leader peptide (mkyllptaaaglllaaqpama) from Erwinia carotovora
PelB ('911LpelB').

To make this construct, the 5'-end PCR primer was designed downstream from the
leader sequence and included the NcoI restriction site in order to have the 911 fused
directly to the PelB leader sequence; the 3'-end primer included the STOP codon.
The expression vector used was pET22b+ (Novagen), which carries the coding
sequence for the PelB leader peptide. The NcoI site introduces an additional
methionine after the PelB sequence.

All three forms of the protein were expressed. ELISA titres were highest using 911L, with
911LOmpA also giving good results.



Example 8 - ORF46 

The complete ORF46 protein from N. meningitidis (serogroup B, strain 2996) has the 
following sequence: 

1 LGISRKISLI LSILAVCLPM HAHASDLAND SFIRQVLDRQ HFEPDGKYHL
51 FGSRGELAER SGHIGLGKIQ SHQLGNLMIQ QAAIKGNIGY IVRFSDHGHE
101 VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY
151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH NAGSMLTQGV
201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG
251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP
301 NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPIK RSQMGAIALP
351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG
401 KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTKLDIQEL SGGGIPKAKP
451 VSDAKPRWEV DRKLNKLTTR EQVEKNVQEI RNGNKNSNFS QHAQLEREIN
501 KLKSADEINF ADGMGKFTDS MNDKAFSRLV KSVKENGFTN PWEYVEING
551 KAYIVRGNWR VFAAEYLGRI HELKFKKVDF PVPNTSWKNP TDVLNESGNV
601 KRPRYRSK*

The leader peptide is underlined. 

The sequences of ORF46 from other strains can be found in WO00/66741. 

Three expression strategies have been used for ORF46:

1) ORF46 with its own leader peptide but without any fusion partner ('ORF46-2L');

2) ORF46 without its leader peptide and without any fusion partner ('ORF46-2'), with
the leader peptide omitted by designing the 5'-end amplification primer downstream
from the predicted leader sequence:

1 SDLANDSFIR QVLDRQHFEP DGKYHLFGSR GELAERSGHI GLGKIQSHQL
51 GNLMIQQAAI KGNIGYIVRF SDHGHEVHSP FDNHASHSDS DEAGSPVDGF
101 SLYRIHWDGY EHHPADGYDG PQGGGYPAPK GARDIYSYDI KGVAQNIRLN
151 LTDNRSTGQR LADRFHNAGS MLTQGVGDGF KRATRYSPEL DRSGNAAEAF
201 NGTADIVKNI IGAAGEIVGA GDAVQGISEG SNIAVMHGLG LLSTENKMAR
251 INDLADMAQL KDYAAAAIRD WAVQNPNAAQ GIEAVSNIFM AAIPIKGIGA
301 VRGKYGLGGI TAHPIKRSQM GAIALPKGKS AVSDNFADAA YAKYPSPYHS
351 RNIRSNLEQR YGKENITSST VPPSNGKNVK LADQRHPKTG VPFDGKGFPN
401 FEKHVKYDTK LDIQELSGGG IPKAKPVSDA KPRWEVDRKL NKLTTREQVE
451 KNVQEIRNGN KNSNFSQHAQ LEREINKLKS ADEINFADGM GKFTDSMNDK
501 AFSRLVKSVK ENGFTNPWE YVEINGKAYI VRGNNRVFAA EYLGRIHELK
551 FKKVDFPVPN TSWKNPTDVL NESGNVKRPR YRSK*

3) ORF46 as a truncated protein, consisting of the first 433 amino acids ('ORF46.1L'),
constructed by designing PCR primers to amplify a partial sequence corresponding
to aa 1-433.

A STOP codon was included in the 3'-end primer sequences.

ORF46-2L is expressed at a very low level in E.coli. Removal of its leader peptide
(ORF46-2) does not solve this problem. The truncated ORF46.1L form (first 433 amino
acids, which are well conserved between serogroups and species), however, is
well-expressed and gives excellent results in the ELISA test and in the bactericidal assay.

ORF46.1 has also been used as the basis of hybrid proteins. It has been fused with 287, 919,
and ORF1. The hybrid proteins were generally insoluble, but gave some good ELISA and
bactericidal results (against the homologous 2996 strain):



Protein               ELISA    Bactericidal Ab
Orf1-Orf46.1-His      850      256
919-Orf46.1-His       12900    512
919-287-Orf46-His     n.d.     n.d.
Orf46.1-287His        150      8192
Orf46.1-919His        2800     2048
Orf46.1-287-919His    3200     16384

For comparison, 'triple' hybrids of ORF46.1, 287 (either as a GST fusion, or in ΔG287
form) and 919 were constructed and tested against various strains (including the homologous
2996 strain) versus a simple mixture of the three antigens. FCA was used as adjuvant:





                        2996     BZ232    MC58     NGH38    F6124    BZ133
Mixture                 8192     256      512      1024     >2048    >2048
ORF46.1-287-919his      16384    256      4096     8192     8192     8192
ΔG287-919-ORF46.1his    8192     64       4096     8192     8192     16384
ΔG287-ORF46.1-919his    4096     128      256      8192     512      1024


Again, the hybrids show equivalent or superior immunological activity. 
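
Because bactericidal titres are reported in two-fold steps, a per-strain comparison is conveniently expressed as a log2 ratio of hybrid to mixture. The Python sketch below is an illustrative way of tabulating the values given above and is not part of the original analysis. Titres reported as ">2048" are treated as lower bounds, so a ratio computed against them is only an upper bound; this handling is an assumption of the sketch.

```python
import math

STRAINS = ["2996", "BZ232", "MC58", "NGH38", "F6124", "BZ133"]

# Titres copied from the table above; ">" marks a lower bound.
TITRES = {
    "Mixture": [8192, 256, 512, 1024, ">2048", ">2048"],
    "ORF46.1-287-919his": [16384, 256, 4096, 8192, 8192, 8192],
    "ΔG287-919-ORF46.1his": [8192, 64, 4096, 8192, 8192, 16384],
    "ΔG287-ORF46.1-919his": [4096, 128, 256, 8192, 512, 1024],
}


def parse_titre(titre):
    """Return (value, is_lower_bound) for entries such as 8192 or '>2048'."""
    text = str(titre)
    return float(text.lstrip(">")), text.startswith(">")


def log2_fold(hybrid, reference):
    """log2(hybrid / reference) and whether the reference is a lower bound."""
    h, _ = parse_titre(hybrid)
    r, censored = parse_titre(reference)
    return math.log2(h / r), censored


for name, titres in TITRES.items():
    if name == "Mixture":
        continue
    for strain, hybrid, mix in zip(STRAINS, titres, TITRES["Mixture"]):
        fold, censored = log2_fold(hybrid, mix)
        note = " (upper bound)" if censored else ""
        print(f"{name:22s} {strain:6s} log2 ratio = {fold:+.0f}{note}")
```

A positive value indicates a higher titre for the hybrid than for the mixture, zero indicates equivalence, and a negative value indicates a lower titre.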



Hybrids of two proteins (strain 2996) were compared to the individual proteins against 
various heterologous strains: 

                    1000     MC58     F6124 (MenA)
ORF46.1-His         <4       4096     <4
ORF1-His            8        256      128
ORF1-ORF46.1-His    1024     512      1024


Again, the hybrid shows equivalent or superior immunological activity. 



Example 9 - protein 961 

The complete 961 protein from N. meningitidis (serogroup B, strain MC58) has the following 
sequence: 

1 MSMKHFPAKV LTTAILATFC SGALAATSDD DVKKAATVAI VAAYNNGQEI
51 NGFKAGETIY DIGEDGTITQ KDATAADVEA DDFKGLGLKK WTNLTKTW
101 ENKQNVDAKV KAAESEIEKL TTKLADTDAA LADTDAALDE TTNALNKLGE
151 NITTFAEETK TNIVKIDEKL EAVADTVDKH AEAFNDIADS LDETNTKADE
201 AVKTANEAKQ TAEETKQNVD AKVKAAETAA GKAEAAAGTA NTAADKAEAV
251 AAKVTDIKAD IATNKADIAK NSARIDSLDK NVANLRKETR QGLAEQAALS
301 GLFQPYNVGR FNVTAAVGGY KSESAVAIGT GFRFTENFAA KAGVAVGTSS
351 GSSAAYHVGV NYEW*

The leader peptide is underlined. 

Three approaches to 961 expression were used:

1) 961 using a GST fusion, following WO99/57280 ('GST961');

2) 961 with its own leader peptide but without any fusion partner ('961L'); and

3) 961 without its leader peptide and without any fusion partner ('961 untagged'), with the
leader peptide omitted by designing the 5'-end PCR primer downstream from the
predicted leader sequence.

All three forms of the protein were expressed. The GST-fusion protein could be purified and
antibodies against it confirmed that 961 is surface exposed (Figure 4). The protein was used
to immunise mice, and the resulting sera gave excellent results in the bactericidal assay.
961L could also be purified and gave very high ELISA titres.

Protein 961 appears to be phase variable. Furthermore, it is not found in all strains of
N. meningitidis.

Example 10 - protein 287

Protein 287 from N. meningitidis (serogroup B, strain 2996) has the following sequence:

1 MFERSVIAMA CIFALSACGG GGGGSPDVKS ADTLSKPAAP WAEKETEVK
51 EDAPQAGSQG QGAPSTQGSQ DMAAVSAENT GNGGAATTDK PKNEDEGPQN
101 DMPQNSAESA HQTGNNQPAD SSDSAPASNP APANGGSNFG RVDLANGVLI
151 DGPSQNITLT HCKGDSCNGD NLLDEEAPSK SEFENLNESE RIEKYKKDGK
201 SDKFTNLVAT AVQANGTNKY VIIYKDKSAS SSSARFRRSA RSRRSLPAEM
251 PLIPVNQADT LIVDGEAVSL TGHSGNIFAP EGNYRYLTYG AEKDPGGSYA
301 LRVQGEPAKG EMLAGTAVYN GEVLHFHTEN GRPYPTRGRF AAKVDFGSKS
351 VDGIIDSGDD LHMGTQKFKA AIDGNGFKGT WTENGGGDVS GRFYGPAGEE
401 VAGKYSYRPT DAEKGGFGVF AGKKEQD*



The leader peptide is shown underlined.

The sequences of 287 from other strains can be found in Figures 5 and 15 of WO00/66741.

Example 9 of WO99/57280 discloses the expression of 287 as a GST-fusion in E.coli.

A number of further approaches to expressing 287 in E.coli have been used, including:

1) 287 as a His-tagged fusion ('287-His');

2) 287 with its own leader peptide but without any fusion partner ('287L');

3) 287 with the ORF4 leader peptide and without any fusion partner ('287LOrf4'); and

4) 287 without its leader peptide and without any fusion partner ('287 untagged'):

1 CGGGGGGSPD VKSADTLSKP AAPWAEKET EVKEDAPQAG SQGQGAPSTQ
51 GSQDMAAVSA ENTGNGGAAT TDKPKNEDEG PQNDMPQNSA ESANQTGNNQ
101 PADSSDSAPA SNPAPANGGS NFGRVDLANG VLIDGPSQNI TLTHCKGDSC
151 NGDNLLDEEA PSKSEFENLN ESERIEKYKK DGKSDKFTNL VATAVQANGT
201 NKYVIIYKDK SASSSSARFR RSARSRRSLP AEMPLIPVNQ ADTLIVDGEA
251 VSLTGHSGNI FAPEGNYRYL TYGAEKLPGG SYALRVQGEP AKGEMLAGTA
301 VYNGEVLHFH TENGRPYPTR GRFAAKVDFG SKSVDGIIDS GDDLHMGTQK
351 FKAAIDGNGF KGTWTENGGG DVSGRFYGPA GEEVAGKYSY RPTDAEKGGF
401 GVFAGKKEQD*

All these proteins could be expressed and purified.

'287L' and '287LOrf4' were confirmed as lipoproteins.

As shown in Figure 2, '287LOrf4' was constructed by digesting 919LOrf4 with NheI and
XhoI. The entire ORF4 leader peptide was restored by the addition of a DNA sequence
coding for the missing amino acids, as a tail, in the 5'-end primer (287LOrf4 for), fused to
287 coding sequence. The 287 gene coding for the mature protein was amplified using the
oligonucleotides 287LOrf4 For and Rev (including the NheI and XhoI sites, respectively),
digested with NheI and XhoI and ligated to the purified pETOrf4 fragment.



Example 11 - further non-fusion proteins with/without native leader peptides 

A similar approach was adopted for E.coli expression of further proteins from WO99/24578,
WO99/36544 and WO99/57280.

The following were expressed without a fusion partner: 008, 105, 117-1, 121-1, 122-1, 128-1,
148, 216, 243, 308, 593, 652, 726, 982, and Orf143-1. Protein 117-1 was confirmed as
surface-exposed by FACS and gave high ELISA titres.

The following were expressed with the native leader peptide but without a fusion partner:
111, 149, 206, 225-1, 235, 247-1, 274, 283, 286, 292, 401, 406, 502-1, 503, 519-1, 525-1,
552, 556, 557, 570, 576-1, 580, 583, 664, 759, 907, 913, 920-1, 926, 936-1, 953, 961, 983,
989, Orf4, Orf7-1, Orf9-1, Orf23, Orf25, Orf37, Orf38, Orf40, Orf40.1, Orf40.2, Orf72-1,
Orf76-1, Orf85-2, Orf91, Orf97-1, Orf119, Orf143.1. These proteins are given the suffix 'L'.

His-tagged protein 760 was expressed with and without its leader peptide. The deletion of
the signal peptide greatly increased expression levels. The protein could be purified most
easily using 2M urea for solubilisation.

His-tagged protein 264 was well-expressed using its own signal peptide, and the 30kDa 
protein gave positive Western blot results. 

All proteins were successfully expressed. 

The localisation of 593, 121-1, 128-1, 726 and 982 in the cytoplasm was confirmed.

The localisation of 920-1L, 953L, ORF9-1L, ORF85-2L, ORF97-1L, 570L, 580L and 664L 
in the periplasm was confirmed. 

The localisation of ORF40L in the outer membrane, and 008 and 519-1L in the inner
membrane was confirmed. ORF25L, ORF4L, 406L, 576-1L were all confirmed as being
localised in the membrane.

Protein 206 was found not to be a lipoprotein. 

ORF25 and ORF40 expressed with their native leader peptides but without fusion partners,
and protein 593 expressed without its native leader peptide and without a fusion partner,
raised good bactericidal sera. Surprisingly, the forms of ORF25 and ORF40 expressed
without fusion partners and using their own leader peptides (i.e. 'ORF25L' and 'ORF40L')
give better results in the bactericidal assay than the fusion proteins.

Proteins 920L and 953L were subjected to N-terminal sequencing, giving HRVWETAH and
ATYKVDEYHANARFAF, respectively. This sequencing confirms that the predicted leader
peptides were cleaved and, when combined with the periplasmic location, confirms that the
proteins are correctly processed and localised by E.coli when expressed from their native
leader peptides.

The N-terminal sequence of protein 519.1L localised in the inner membrane was MEFFIILDA,
indicating that the leader sequence is not cleaved. It may therefore function as both an
uncleaved leader sequence and a transmembrane anchor in a manner similar to the leader
peptide of PBP1 from N. gonorrhoeae [Ropp & Nicholas (1997) J. Bact. 179:2783-2787].
Indeed the N-terminal region exhibits strong hydrophobic character and is predicted by the
Tmpred program to be transmembrane.

Example 12 - lipoproteins 

The incorporation of palmitate in recombinant lipoproteins was demonstrated by the method
of Kraft et al. [J. Bact. (1998) 180:3441-3447]. Single colonies harbouring the plasmid of
interest were grown overnight at 37°C in 20 ml of LB/Amp (100 µg/ml) liquid culture. The
culture was diluted to an OD550 of 0.1 in 5.0 ml of fresh LB/Amp medium
containing 5 µCi/ml [3H] palmitate (Amersham). When the OD550 of the culture reached 0.4-0.8,
recombinant lipoprotein was induced for 1 hour with IPTG (final concentration 1.0
mM). Bacteria were harvested by centrifugation in a bench top centrifuge at 2700g for 15
min and washed twice with 1.0 ml cold PBS. Cells were resuspended in 120 µl of 20 mM
Tris-HCl (pH 8.0), 1 mM EDTA, 1.0% w/v SDS and lysed by boiling for 10 min. After
centrifugation at 13000g for 10 min the supernatant was collected and proteins precipitated
by the addition of 1.2 ml cold acetone and left for 1 hour at -20 °C. Protein was pelleted by
centrifugation at 13000g for 10 min and resuspended in 20-50 µl (calculated to standardise
loading with respect to the final O.D. of the culture) of 1.0% w/v SDS. An aliquot of 15 µl
was boiled with 5 µl of SDS-PAGE sample buffer and analysed by SDS-PAGE. After
electrophoresis gels were fixed for 1 hour in 10% v/v acetic acid and soaked for 30 minutes
in Amplify solution (Amersham). The gel was vacuum-dried under heat and exposed to
Hyperfilm (Kodak) overnight at -80 °C.
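
The dilution step in this protocol (overnight culture diluted to an OD550 of 0.1 in 5.0 ml of medium containing 5 µCi/ml label) is simple C1·V1 = C2·V2 arithmetic. The following Python sketch illustrates it; the function names and the example overnight OD of 2.5 are hypothetical, and it assumes that OD550 scales linearly with cell density, which is only approximately true for dense cultures.

```python
def inoculum_volume_ml(od_overnight, od_target=0.1, final_volume_ml=5.0):
    """Volume of overnight culture (ml) that gives od_target in final_volume_ml.

    Uses C1*V1 = C2*V2 and assumes OD550 is proportional to cell density.
    """
    if od_overnight <= od_target:
        raise ValueError("overnight culture must be denser than the target OD")
    return od_target * final_volume_ml / od_overnight


def label_activity_uci(final_volume_ml=5.0, uci_per_ml=5.0):
    """Total [3H] palmitate activity (microcuries) needed for the labelled culture."""
    return final_volume_ml * uci_per_ml


if __name__ == "__main__":
    od_overnight = 2.5  # hypothetical reading, not taken from the source
    print(f"inoculum: {inoculum_volume_ml(od_overnight):.2f} ml")
    print(f"label:    {label_activity_uci():.1f} uCi")
```

The defaults mirror the figures given in the protocol, so only the measured overnight OD needs to be supplied.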

Incorporation of the [3H] palmitate label, confirming lipidation, was found for the following
proteins: Orf4L, Orf25L, 287L, 287LOrf4, 406L, 576L, 926L, 919L and 919LOrf4.

Example 13 - domains in 287 

Based on homology of different regions of 287 to proteins that belong to different functional
classes, it was split into three 'domains', as shown in Figure 5. The second domain shows
homology to IgA proteases, and the third domain shows homology to transferrin-binding
proteins.

Each of the three 'domains' shows a different degree of sequence conservation between
N. meningitidis strains - domain C is 98% identical, domain A is 83% identical, whilst
domain B is only 71% identical. Note that protein 287 in strain MC58 is 61 amino acids
longer than that of strain 2996. An alignment of the two sequences is shown in Figure 7, and
alignments for various strains are disclosed in WO00/66741 (see Figures 5 and 15 therein).

The three domains were expressed individually as C-terminal His-tagged proteins. This was 
done for the MC58 and 2996 strains, using the following constructs: 

287a-MC58 (aa 1-202), 287b-MC58 (aa 203-288), 287c-MC58 (aa 311-488).

287a-2996 (aa 1-139), 287b-2996 (aa 140-225), 287c-2996 (aa 250-427). 
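
The residue ranges of the six domain constructs can be checked with a short Python sketch. The coordinates are those listed above; the total lengths (427 residues for strain 2996 and 488 for MC58) are taken from the sequence listings in this document, and the helper names are hypothetical. The sketch reports the length of each construct and any residues not covered by a construct.

```python
# 1-based, inclusive residue ranges of the domain constructs listed above.
CONSTRUCTS = {
    "MC58": {"287a": (1, 202), "287b": (203, 288), "287c": (311, 488)},
    "2996": {"287a": (1, 139), "287b": (140, 225), "287c": (250, 427)},
}
# Full-length sizes read from the sequence listings (assumed correct).
TOTAL_LENGTH = {"MC58": 488, "2996": 427}


def span_length(span):
    """Number of residues in a 1-based inclusive (start, end) range."""
    start, end = span
    return end - start + 1


def uncovered(spans, total_length):
    """Return the 1-based inclusive ranges not covered by any span."""
    gaps, next_expected = [], 1
    for start, end in sorted(spans):
        if start > next_expected:
            gaps.append((next_expected, start - 1))
        next_expected = max(next_expected, end + 1)
    if next_expected <= total_length:
        gaps.append((next_expected, total_length))
    return gaps


if __name__ == "__main__":
    for strain, domains in CONSTRUCTS.items():
        sizes = {name: span_length(span) for name, span in domains.items()}
        gaps = uncovered(domains.values(), TOTAL_LENGTH[strain])
        print(strain, sizes, "not in any construct:", gaps)
```

On these coordinates the B and C constructs have the same length in both strains, and the 61-residue difference between the strains falls entirely within domain A.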

To make these constructs, the stop codon sequence was omitted in the 3'-end primer
sequence. The 5' primers included the NheI restriction site, and the 3' primers included an
XhoI site as a tail, in order to direct the cloning of each amplified fragment into the expression
vector pET21b+ using NdeI-XhoI, NheI-XhoI or NdeI-HindIII restriction sites.

All six constructs could be expressed, but 287b-MC58 required denaturation and refolding for
solubilisation.

Deletion of domain A is described below ('Δ4 287-His').

Immunological data (serum bactericidal assay) were also obtained using the various domains
from strain 2996, against the homologous and heterologous MenB strains, as well as MenA
(F6124 strain) and MenC (BZ133 strain):
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NGH38 
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287-His 


32000 


16 
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16000 


287(B)-His 
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16 




287(C)-His 


256 




32 


512 


32 


2048 


>2048 


287(B-C)-His 


64000 


128 


4096 


64000 


1024 


64000 


32000 



Using the domains of strain MC58, the following results were obtained:








MC58 


2996 


BZ232 


NGH38 


394/98 


MenA 
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287-His 
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128 


128 
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287(C)-His 




16 




1024 




512 




287(B-C)-His 


16000 


64000 


128 


64000 


512 


64000 


>8000 



Example 14 - deletions in 287 

As well as expressing individual domains, 287 was also expressed (as a C-terminal 
His-tagged protein) by making progressive deletions within the first domain. These 

Four deletion mutants of protein 287 from strain 2996 were used (Figure 6): 

1) '287-His', consisting of amino acids 18-427 (i.e. leader peptide deleted); 

2) 'Δ1 287-His', consisting of amino acids 26-427;

3) 'Δ2 287-His', consisting of amino acids 70-427;

4) 'Δ3 287-His', consisting of amino acids 107-427; and

5) 'Δ4 287-His', consisting of amino acids 140-427 (=287-bc).

The 'Δ4' protein was also made for strain MC58 ('Δ4 287MC58-His'; aa 203-488).

The constructs were made in the same way as 287a/b/c, as described above. 

All six constructs could be expressed and protein could be purified. Expression of 287-His 
was, however, quite poor. 

Expression was also high when the C-terminal His-tags were omitted. 

Immunological data (serum bactericidal assay) were also obtained using the deletion 
mutants, against the homologous (2996) and heterologous MenB strains, as well as MenA 
(F6124 strain) and MenC (BZ133 strain): 
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BZ232 
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MenA 


MenC 


287-his 


32000 
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4096 
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8000 
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Al 287-His 


16000 


128 


4096 


4096 


1024 


8000 


16000 


A2 287-His 


16000 


128 


4096 
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16000 
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A3 287-His 


16000 


128 


4096 


>2048 


512 


16000 
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A4 287-His 


64000 


128 


4096 


64000 


1024 


64000 


32000 



The same high activity for the Δ4 deletion was seen using the sequence from strain MC58.

As well as showing superior expression characteristics, therefore, the mutants are
immunologically equivalent or superior.

Example 15 - poly-glycine deletions 

The 'Δ1 287-His' construct of the previous example differs from 287-His and from
'287 untagged' only by a short N-terminal deletion (GGGGGGS). Using an expression vector
which replaces the deleted serine with a codon present in the NheI cloning site, however, this
amounts to a deletion only of (Gly)6. Thus, the deletion of this (Gly)6 sequence has been
shown to have a dramatic effect on protein expression.

The protein lacking the N-terminal amino acids up to GGGGGG is called 'ΔG287'. In strain
MC58, its sequence (leader peptide underlined) is:

ΔG287

1 MFKRSVIAMA CIFALSACGG GGGGSPDVKS ADTLSKPAAP VVSEKETEAK
51 EDAPQAGSQG QGAPSAQGSQ DMAAVSEENT GNGGAVTADN PKNEDEVAQN
101 DMPQNAAGTD SSTPNHTPDP NMLAGNMENQ ATDAGESSQP ANQPDMANAA
151 DGMQGDDPSA GGQNAGNTAA QGANQAGNNQ AAGSSDPIPA SNPAPANGGS
201 NFGRVDLANG VLIDGPSQNI TLTHCKGDSC SGNNFLDEEV QLKSEFEKLS
251 DADKISNYKK DGKNDKFVGL VADSVQMKGI NQYIIFYKPK PTSFARFRRS
301 ARSRRSLPAE MPLIPVNQAD TLIVDGEAVS LTGHSGNIFA PEGNYRYLTY
351 GAEKLPGGSY ALRVQGEPAK GEMLAGAAVY NGEVLHFHTE NGRPYPTRGR
401 FAAKVDFGSK SVDGIIDSGD DLHMGTQKFK AAIDGNGFKG TWTENGSGDV
451 SGKFYGPAGE EVAGKYSYRP TDAEKGGFGV FAGKKEQD*

ΔG287, with or without His-tag ('ΔG287-His' and 'ΔG287K', respectively), are expressed at
very good levels in comparison with the '287-His' or '287 untagged'.

On the basis of gene variability data, variants of ΔG287-His were expressed in E.coli from a
number of MenB strains, in particular from strains 2996, MC58, 1000, and BZ232. The
results were also good.

It was hypothesised that poly-Gly deletion might be a general strategy to improve
expression. Other MenB lipoproteins containing similar (Gly)n motifs (near the N-terminus,
downstream of a cysteine) were therefore identified, namely Tbp2 (NMB0460), 741
(NMB1870) and 983 (NMB1969):

Tbp2 ΔGTbp2

1 MNNPLVNQAA MVLPVFLLSA CLGGGGSFDL DSVDTEAPRP APKYQDVFSE
51 KPQAQKDQGG YGFAMRLKRR NWYPQAKEDE VKLDESDWEA TGLPDEPKEL
101 PKRQKSVIEK VETDSDNNIY SSPYLKPSNH QNGNTGNGIN QPKNQAKDYE
151 NFKYVYSGWF YKHAKREFNL KVEPKSAKNG DDGYIFYHGK EPSRQLPASG
201 KITYKGVWHF ATDTKKGQKF REIIQPSKSQ GDRYSGFSGD DGEEYSNKNK
251 STLTDGQEGY GFTSNLEVDF HNKKLTGKLI RNNANTDNNQ ATTTQYYSLE
301 AQVTGNRFNG KATATDKPQQ NSETKEHPFV SDSSSLSGGF FGPQGEELGF
351 RFLSDDQKVA VVGSAKTKDK PANGNTAAAS GGTDAAASNG AAGTSSENGK
401 LTTVLDAVEL KLGDKEVQKL DNFSNAAQLV VDGIMIPLLP EASESGNNQA
451 NQGTNGGTAF TRKFDHTPES DKKDAQAGTQ TNGAQTASNT AGDTNGKTKT
501 YEVEVCCSNL NYLKYGMLTR KNSKSAMQAG ESSSQADAKT EQVEQSMFLQ
551 GERTDEKEIP SEQNIVYRGS WYGYIANDKS TSWSGNASNA TSGNRAEFTV
601 NFADKKITGT LTADNRQEAT FTIDGNIKDN GFEGTAKTAE SGFDLDQSNT
651 TRTPKAYITD AKVQGGFYGP KAEELGGWFA YPGDKQTKNA TNASGNSSAT
701 VVFGAKRQQP VR*



741 ΔG741

1 VNRTAFCCLS LTTALILTAC SSGGGGVAAD IGAGLADALT APLDHKDKGL
51 QSLTLDQSVR KNEKLKLAAQ GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ
101 IEVDGQLITL ESGEFQVYKQ SHSALTAFQT EQIQDSEHSG KMVAKRQFRI
151 GDIAGEHTSF DKLPEGGRAT YRGTAFGSDD AGGKLTYTID FAAKQGNGKI
201 EHLKSPELNV DLAAADIKPD GKRHAVISGS VLYNQAEKGS YSLGIFGGKA
251 QEVAGSAEVK TVNGIRHIGL AAKQ*



983 ΔG983

1 MRTTPTFPTK TFKPTAMALA VATTLSACLG GGGGGTSAPD FNAGGTGIGS
51 NSRATTAKSA AVSYAGIKNE MCKDRSMLCA GRDDVAVTDR DAKINAPPPN
101 LHTGDFPNPN DAYKNLINLK PAIEAGYTGR GVEVGIVDTG ESVGSISFPE
151 LYGRKEHGYN ENYKNYTAYM RKEAPEDGGG KDIEASFDDE AVIETEAKPT
201 DIRHVKEIGH IDLVSHIIGG RSVDGRPAGG XAPDATLHIM NTNDETKNEM
251 MVAAIRNAWV KLGERGVRIV NNSFGTTSRA GTADLFQIAN SEEQYRQALL
301 DYSGGDKTDE GIRLMQQSDY GHDSYHIRNK NMLFXFSTGN DAQAQPNTY&
351 LLPFYEKDAQ KGIITVAGVD RSGEKFKREM YGEPGTEPLE YGSNHCGITA
401 MWCLSAPYEA SVRFTRTNPI QIAGTSFSAP IVTGTAALLL QKYPWMSNDN
451 LRTTLLTTAQ DIGAVGVDSK FGWGLLDAGK AMNGPASFPF GDFTADTKGT
501 SDIAYSFRND ISGTGGLIKK GGSQLQLHGN NTYTGKTIIE GGSLVLYGNN
551 KSDMRVETKG ALIYNGAASG GSLNSDGIVY LADTDQSGAN ETVHIKGSLQ
601 LDGKGTLYTR LGKLLKVDGT AIIGGKLYMS ARGKGAGYLN STGRRVPFLS
651 AAKIGQDYSF FTNIETDGGL LASLDSVEKT AGSEGDTLSY YVRRGNAART
701 ASAAAHSAPA GLKHAVEQGG SNLEWLMVEL DASESSATPE TVETAAADRT
751 DMPGIRPYGA TFRAAAAVQH ANAADGVRIF NSLAATVYAD STAAHADMQG
801 RRLKAVSDGL DHNGTGLRVI AQTQQDGGTW EQGGVEGKMR GSTQTVGIAA
851 KTGENTTAAA TLGMGRSTWS ENSANAKTDS ISLFAGIRHD AGDIGYLKGL
901 FSYGRYKNSI SRSTGADEHA EGSVNGTLMQ LGALGGVWP FAATGDLTVE
951 GGLRYDLLKQ DAFAEKGSAL GWSGNSLTEG TLVGLAGLKL SQPLSDKAVL
1001 FATAGVERDL NGRDYTVTGG FTGATAATGK TGARNMPHTR LVAGLGADVE
1051 FGNGWNGLAR YSYAGSKQYG NHSGRVGVGY RF*
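
The search for (Gly)n motifs near the N-terminus and downstream of a cysteine can be illustrated with a short Python sketch. This is not the method used to identify the proteins above; the regular expression, the thresholds (at least four glycines, at most two residues between the cysteine and the glycines) and the function names are assumptions chosen to fit the four N-terminal fragments quoted from the sequences in this document.

```python
import re

# N-terminal fragments copied from the sequences in this document
# (Tbp2 and 741 fragments start at residue 11).
N_TERMINI = {
    "287 (2996)": "MFERSVIAMACIFALSACGGGGGGSPDVKSADTLSKPAAP",
    "Tbp2": "MVLPVFLLSACLGGGGSFDLDSVDTEAPRPAPKYQDVFSE",
    "741": "LTTALILTACSSGGGGVAADIGAGLADALT",
    "983": "MRTTPTFPTKTFKPTAMALAVATTLSACLGGGGGGTSAPD",
}

# Assumed motif: a cysteine, up to two non-glycine residues, then >= 4 glycines.
POLY_GLY = re.compile(r"C[^G]{0,2}(G{4,})")


def find_poly_gly(seq):
    """Return (start, end) of the glycine run (0-based, end exclusive), or None."""
    match = POLY_GLY.search(seq)
    return match.span(1) if match else None


def delta_g(seq):
    """Return the sequence downstream of the poly-Gly stretch (a 'delta-G' form)."""
    span = find_poly_gly(seq)
    if span is None:
        raise ValueError("no poly-glycine motif downstream of a cysteine")
    return seq[span[1]:]


if __name__ == "__main__":
    for name, seq in N_TERMINI.items():
        start, end = find_poly_gly(seq)
        print(f"{name}: (Gly){end - start}, remainder starts {delta_g(seq)[:8]}")
```

Note that in the expression constructs described here the residue following the glycines may additionally be replaced by vector-encoded residues, which this sketch does not model.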

Tbp2 and 741 genes were from strain MC58; 983 and 287 genes were from strain 2996.
These were cloned in pET vector and expressed in E.coli without the sequence coding for
their leader peptides or as "ΔG forms", both fused to a C-terminal His-tag. In each case, the
same effect was seen - expression was good in the clones carrying the deletion of the
poly-glycine stretch, and poor or absent if the glycines were present in the expressed protein.

SDS-PAGE of the proteins is shown in Figure 13.



ΔG287 and hybrids

ΔG287 proteins were made and purified for strains MC58, 1000 and BZ232. Each of these
gave high ELISA titres and also serum bactericidal titres of >8192. ΔG287K, expressed from
pET-24b, gave excellent titres in ELISA and the serum bactericidal assay.
ΔG287-ORF46.1K may also be expressed in pET-24b.

ΔG287 was also fused directly in-frame upstream of 919, 953, 961 (sequences shown below)
and ORF46.1:



ΔG287-919

1 ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC
51 TCCTGTTGTT GCTGAAAAAG AGACAGAGGT AAAAGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCA CACAAGGCAG CCAAGATATG
151 GCGGCAGTTT CGGCAGAAAA TACAGGCAAT GGCGGTGCGG CAACAACGGA
201 CAAACCCAAA AATGAAGACG AGGGACCGCA AAATGATATG CCGCAAAATT
251 CCGCCGAATC CGCAAATCAA ACAGGGAACA ACCAACCCGC CGATTCTTCA
301 GATTCCGCCC CCGCGTCAAA CCCTGCACCT GCGAATGGCG GTAGCAATTT
351 TGGAAGGGTT GATTTGGCTA ATGGCGTTTT GATTGATGGG CCGTCGCAAA
401 ATATAACGTT GACCCACTGT AAAGGCGATT CTTGTAATGG TGATAATTTA
451 TTGGATGAAG AAGCACCGTC AAAATCAGAA TTTGAAAATT TAAATGAGTC
501 TGAACGAATT GAGAAATATA AGAAAGATGG GAAAAGCGAT AAATTTACTA
551 ATTTGGTTGC GACAGCAGTT CAAGCTAATG GAACTAACAA ATATGTCATC
601 ATTTATAAAG ACAAGTCCGC TTCATCTTCA TCTGCGCGAT TCAGGCGTTC
651 TGCACGGTCG AGGAGGTCGC TTCCTGCCGA GATGCCGCTA ATCCCCGTCA
701 ATCAGGCGGA TACGCTGATT GTCGATGGGG AAGCGGTCAG CCTGACGGGG
751 CATTCCGGCA ATATCTTCGC GCCCGAAGGG AATTACCGGT ATCTGACTTA
801 CGGGGCGGAA AAATTGCCCG GCGGATCGTA TGCCCTCCGT GTGCAAGGCG
851 AACCGGCAAA AGGCGAAATG CTTGCTGGCA CGGCCGTGTA CAACGGCGAA
901 GTGCTGCATT TTCATACGGA AAACGGCCGT CCGTACCCGA CTAGAGGCAG
951 GTTTGCCGCA AAAGTCGATT TCGGCAGCAA ATCTGTGGAC GGCATTATCG
1001 ACAGCGGCGA TGATTTGCAT ATGGGTACGC AAAAATTCAA AGCCGCCATC
1051 GATGGAAACG GCTTTAAGGG GACTTGGACG GAAAATGGCG GCGGGGATGT
1101 TTCCGGAAGG TTTTACGGCC CGGCCGGCGA GGAAGTGGCG GGAAAATACA
1151 GCTATCGCCC GACAGATGCG GAAAAGGGCG GATTCGGCGT GTTTGCCGGC
1201 AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGATGCCAAA GCAAGAGCAT
1251 CCAAACCTTT CCGCAACCCG ACACATCCGT CATCAACGGC CCGGACCGGC
1301 CGGTCGGCAT CCCCGACCCC GCCGGAACGA CGGTCGGCGG CGGCGGGGCC
1351 GTCTATACCG TTGTACCGCA CCTGTCCCTG CCCCACTGGG CGGCGCAGGA
1401 TTTCGCCAAA AGCCTGCAAT CCTTCCGCCT CGGCTGCGCC AATTTGAAAA
1451 ACCGCCAAGG CTGGCAGGAT GTGTGCGCCC AAGCCTTTCA AACCCCCGTC
1501 CATTCCTTTC AGGCAAAACA GTTTTTTGAA CGCTATTTCA CGCCGTGGCA
1551 GGTTGCAGGC AACGGAAGCC TTGCCGGTAC GGTTACCGGC TATTACGAGC
1601 CGGTGCTGAA GGGCGACGAC AGGCGGACGG CACAAGCCCG CTTCCCGATT
1651 TACGGTATTC CCGACGATTT TATCTCCGTC CCCCTGCCTG CCGGTTTGCG
1701 GAGCGGAAAA GCCCTTGTCC GCATCAGGCA GACGGGAAAA AACAGCGGCA
1751 CAATCGACAA TACCGGCGGC ACACATACCG CCGACCTCTC CCGATTCCCC
1801 ATCACCGCGC GCACAACGGC AATCAAAGGC AGGTTTGAAG GAAGCCGCTT
1851 CCTCCCCTAC CACACGCGCA ACCAAATCAA CGGCGGCGCG CTTGACGGCA
1901 AAGCCCCGAT ACTCGGTTAC GCCGAAGACC CCGTCGAACT TTTTTTTATG
1951 CACATCCAAG GCTCGGGCCG TCTGAAAACC CCGTCCGGCA AATACATCCG
2001 CATCGGCTAT GCCGACAAAA ACGAACATCC CTACGTTTCC ATCGGACGCT
2051 ATATGGCGGA CAAAGGCTAC CTCAAGCTCG GGCAGACCTC GATGCAGGGC
2101 ATCAAAGCCT ATATGCGGCA AAATCCGCAA CGCCTCGCCG AAGTTTTGGG
2151 TCAAAACCCC AGCTATATCT TTTTCCGCGA GCTTGCCGGA AGCAGCAATG
2201 ACGGTCCCGT CGGCGCACTG GGCACGCCGT TGATGGGGGA ATATGCCGGC
2251 GCAGTCGACC GGCACTACAT TACCTTGGGC GCGCCCTTAT TTGTCGCCAC
2301 CGCCCATCCG GTTACCCGCA AAGCCCTCAA CCGCCTGATT ATGGCGCAGG
2351 ATACCGGCAG CGCGATTAAA GGCGCGGTGC GCGTGGATTA TTTTTGGGGA
2401 TACGGCGACG AAGCCGGCGA ACTTGCCGGC AAACAGAAAA CCACGGGTTA
2451 CGTCTGGCAG CTCCTACCCA ACGGTATGAA GCCCGAATAC CGCCCGTAAC
2501 TCGAG

1 MASPDVKSAD TLSKPAAPVV AEKETEVKED APQAGSQGQG APSTQGSQDM
51 AAVSAENTGN GGAATTDKPK NEDEGPQNDM PQNSAESANQ TGNNQPADSS
101 DSAPASNPAP ANGGSNFGRV DLANGVLIDG PSQNITLTHC KGDSCNGDNL
151 LDEEAPSKSE FENLNESERI EKYKKDGKSD KFTNLVATAV QANGTNKYVI
201 IYKDKSASSS SARFRRSARS RRSLPAEMPL IPVNQADTLI VDGEAVSLTG
251 HSGNIFAPEG NYRYLTYGAE KLPGGSYALR VQGEPAKGEM LAGTAVYNGE
301 VLHFHTENGR PYPTRGRFAA KVDFGSKSVD GIIDSGDDLH MGTQKFKAAI
351 DGNGFKGTWT ENGGGDVSGR FYGPAGEEVA GKYSYRPTDA EKGGFGVFAG
401 KKEQDGSGGG GCQSKSIQTF PQPDTSVING PDRPVGIPDP AGTTVGGGGA
451 VYTVVPHLSL PHWAAQDFAK SLQSFRLGCA NLKNRQGWQD VCAQAFQTPV
501 HSFQAKQFFE RYFTPWQVAG NGSLAGTVTG YYEPVLKGDD RRTAQARFPI
551 YGIPDDFISV PLPAGLRSGK ALVRIRQTGK NSGTIDNTGG THTADLSRFP
601 ITARTTAIKG RFEGSRFLPY HTRNQINGGA LDGKAPILGY AEDPVELFFM
651 HIQGSGRLKT PSGKYIRIGY ADKNEHPYVS IGRYMADKGY LKLGQTSMQG
701 IKAYMRQNPQ RLAEVLGQNP SYIFFRELAG SSNDGPVGAL GTPLMGEYAG
751 AVDRHYITLG APLFVATAHP VTRKALNRLI MAQDTGSAIK GAVRVDYFWG
801 YGDEAGELAG KQKTTGYVWQ LLPNGMKPEY RP*

ΔG287-953

1 ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC
51 TCCTGTTGTT GCTGAAAAAG AGACAGAGGT AAAAGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCA CACAAGGCAG CCAAGATATG
151 GCGGCAGTTT CGGCAGAAAA TACAGGCAAT GGCGGTGCGG CAACAACGGA
201 CAAACCCAAA AATGAAGACG AGGGACCGCA AAATGATATG CCGCAAAATT
251 CCGCCGAATC CGCAAATCAA ACAGGGAACA ACCAACCCGC CGATTCTTCA
301 GATTCCGCCC CCGCGTCAAA CCCTGCACCT GCGAATGGCG GTAGCAATTT
351 TGGAAGGGTT GATTTGGCTA ATGGCGTTTT GATTGATGGG CCGTCGCAAA
401 ATATAACGTT GACCCACTGT AAAGGCGATT CTTGTAATGG TGATAATTTA
451 TTGGATGAAG AAGCACCGTC AAAATCAGAA TTTGAAAATT TAAATGAGTC
501 TGAACGAATT GAGAAATATA AGAAAGATGG GAAAAGCGAT AAATTTACTA
551 ATTTGGTTGC GACAGCAGTT CAAGCTAATG GAACTAACAA ATATGTCATC
601 ATTTATAAAG ACAAGTCCGC TTCATCTTCA TCTGCGCGAT TCAGGCGTTC
651 TGCACGGTCG AGGAGGTCGC TTCCTGCCGA GATGCCGCTA ATCCCCGTCA
701 ATCAGGCGGA TACGCTGATT GTCGATGGGG AAGCGGTCAG CCTGACGGGG
751 CATTCCGGCA ATATCTTCGC GCCCGAAGGG AATTACCGGT ATCTGACTTA
801 CGGGGCGGAA AAATTGCCCG GCGGATCGTA TGCCCTCCGT GTGCAAGGCG
851 AACCGGCAAA AGGCGAAATG CTTGCTGGCA CGGCCGTGTA CAACGGCGAA
901 GTGCTGCATT TTCATACGGA AAACGGCCGT CCGTACCCGA CTAGAGGCAG
951 GTTTGCCGCA AAAGTCGATT TCGGCAGCAA ATCTGTGGAC GGCATTATCG
1001 ACAGCGGCGA TGATTTGCAT ATGGGTACGC AAAAATTCAA AGCCGCCATC
1051 GATGGAAACG GCTTTAAGGG GACTTGGACG GAAAATGGCG GCGGGGATGT
1101 TTCCGGAAGG TTTTACGGCC CGGCCGGCGA GGAAGTGGCG GGAAAATACA
1151 GCTATCGCCC GACAGATGCG GAAAAGGGCG GATTCGGCGT GTTTGCCGGC
1201 AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGAGCCACCT ACAAAGTGGA
1251 CGAATATCAC GCCAACGCCC GTTTCGCCAT CGACCATTTC AACACCAGCA
1301 CCAACGTCGG CGGTTTTTAC GGTCTGACCG GTTCCGTCGA GTTCGACCAA
1351 GCAAAACGCG ACGGTAAAAT CGACATCACC ATCCCCGTTG CCAACCTGCA
1401 AAGCGGTTCG CAACACTTTA CCGACCACCT GAAATCAGCC GACATCTTCG
1451 ATGCCGCCCA ATATCCGGAC ATCCGCTTTG TTTCCACCAA ATTCAACTTC
1501 AACGGCAAAA AACTGGTTTC CGTTGACGGC AACCTGACCA TGCACGGCAA
1551 AACCGCCCCC GTCAAACTCA AAGCCGAAAA ATTCAACTGC TACCAAAGCC
1601 CGATGGCGAA AACCGAAGTT TGCGGCGGCG ACTTCAGCAC CACCATCGAC
1651 CGCACCAAAT GGGGCGTGGA CTACCTCGTT AACGTTGGTA TGACCAAAAG
1701 CGTCCGCATC GACATCCAAA TCGAGGCAGC CAAACAATAA CTCGAG

1 MASPDVKSAD TLSKPAAPVV AEKETEVKED APQAGSQGQG APSTQGSQDM
51 AAVSAENTGN GGAATTDKPK NEDEGPQNDM PQNSAESANQ TGNNQPADSS
101 DSAPASNPAP ANGGSNFGRV DLANGVLIDG PSQNITLTHC KGDSCNGDNL
151 LDEEAPSKSE FENLNESERI EKYKKDGKSD KFTNLVATAV QANGTNKYVI
201 IYKDKSASSS SARFRRSARS RRSLPAEMPL IPVNQADTLI VDGEAVSLTG
251 HSGNIFAPEG NYRYLTYGAE KLPGGSYALR VQGEPAKGEM LAGTAVYNGE
301 VLHFHTENGR PYPTRGRFAA KVDFGSKSVD GIIDSGDDLH MGTQKFKAAI
351 DGNGFKGTWT ENGGGDVSGR FYGPAGEEVA GKYSYRPTDA EKGGFGVFAG
401 KKEQDGSGGG GATYKVDEYH ANARFAIDHF NTSTNVGGFY GLTGSVEFDQ
451 AKRDGKIDIT IPVANLQSGS QHFTDHLKSA DIFDAAQYPD IRFVSTKFNF
501 NGKKLVSVDG NLTMHGKTAP VKLKAEKFNC YQSPMAKTEV CGGDFSTTID
551 RTKWGVDYLV NVGMTKSVRI DIQIEAAKQ*

ΔG287-961

1 ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC
51 TCCTGTTGTT GCTGAAAAAG AGACAGAGGT AAAAGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCA CACAAGGCAG CCAAGATATG
151 GCGGCAGTTT CGGCAGAAAA TACAGGCAAT GGCGGTGCGG CAACAACGGA
201 CAAACCCAAA AATGAAGACG AGGGACCGCA AAATGATATG CCGCAAAATT
251 CCGCCGAATC CGCAAATCAA ACAGGGAACA ACCAACCCGC CGATTCTTCA
301 GATTCCGCCC CCGCGTCAAA CCCTGCACCT GCGAATGGCG GTAGCAATTT
351 TGGAAGGGTT GATTTGGCTA ATGGCGTTTT GATTGATGGG CCGTCGCAAA
401 ATATAACGTT GACCCACTGT AAAGGCGATT CTTGTAATGG TGATAATTTA
451 TTGGATGAAG AAGCACCGTC AAAATCAGAA TTTGAAAATT TAAATGAGTC
501 TGAACGAATT GAGAAATATA AGAAAGATGG GAAAAGCGAT AAATTTACTA
551 ATTTGGTTGC GACAGCAGTT CAAGCTAATG GAACTAACAA ATATGTCATC
601 ATTTATAAAG ACAAGTCCGC TTCATCTTCA TCTGCGCGAT TCAGGCGTTC
651 TGCACGGTCG AGGAGGTCGC TTCCTGCCGA GATGCCGCTA ATCCCCGTCA
701 ATCAGGCGGA TACGCTGATT GTCGATGGGG AAGCGGTCAG CCTGACGGGG
751 CATTCCGGCA ATATCTTCGC GCCCGAAGGG AATTACCGGT ATCTGACTTA
801 CGGGGCGGAA AAATTGCCCG GCGGATCGTA TGCCCTCCGT GTGCAAGGCG
851 AACCGGCAAA AGGCGAAATG CTTGCTGGCA CGGCCGTGTA CAACGGCGAA
901 GTGCTGCATT TTCATACGGA AAACGGCCGT CCGTACCCGA CTAGAGGCAG
951 GTTTGCCGCA AAAGTCGATT TCGGCAGCAA ATCTGTGGAC GGCATTATCG
1001 ACAGCGGCGA TGATTTGCAT ATGGGTACGC AAAAATTCAA AGCCGCCATC
1051 GATGGAAACG GCTTTAAGGG GACTTGGACG GAAAATGGCG GCGGGGATGT
1101 TTCCGGAAGG TTTTACGGCC CGGCCGGCGA GGAAGTGGCG GGAAAATACA
1151 GCTATCGCCC GACAGATGCG GAAAAGGGCG GATTCGGCGT GTTTGCCGGC
1201 AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGAGCCACAA ACGACGACGA
1251 TGTTAAAAAA GCTGCCACTG TGGCCATTGC TGCTGCCTAC AACAATGGCC
1301 AAGAAATCAA CGGTTTCAAA GCTGGAGAGA CCATCTACGA CATTGATGAA
1351 GACGGCACAA TTACCAAAAA AGACGCAACT GCAGCCGATG TTGAAGCCGA
1401 CGACTTTAAA GGTCTGGGTC TGAAAAAAGT CGTGACTAAC CTGACCAAAA
1451 CCGTCAATGA AAACAAACAA AACGTCGATG CCAAAGTAAA AGCTGCAGAA
1501 TCTGAAATAG AAAAGTTAAC AACCAAGTTA GCAGACACTG ATGCCGCTTT
1551 AGCAGATACT GATGCCGCTC TGGATGCAAC CACCAACGCC TTGAATAAAT
1601 TGGGAGAAAA TATAACGACA TTTGCTGAAG AGACTAAGAC AAATATCGTA
1651 AAAATTGATG AAAAATTAGA AGCCGTGGCT GATACCGTCG ACAAGCATGC
1701 CGAAGCATTC AACGATATCG CCGATTCATT GGATGAAACC AACACTAAGG
1751 CAGACGAAGC CGTCAAAACC GCCAATGAAG CCAAACAGAC GGCCGAAGAA
1801 ACCAAACAAA ACGTCGATGC CAAAGTAAAA GCTGCAGAAA CTGCAGCAGG
1851 CAAAGCCGAA GCTGCCGCTG GCACAGCTAA TACTGCAGCC GACAAGGCCG
1901 AAGCTGTCGC TGCAAAAGTT ACCGACATCA AAGCTGATAT CGCTACGAAC
1951 AAAGATAATA TTGCTAAAAA AGCAAACAGT GCCGACGTGT ACACCAGAGA
2001 AGAGTCTGAC AGCAAATTTG TCAGAATTGA TGGTCTGAAC GCTACTACCG
2051 AAAAATTGGA CACACGCTTG GCTTCTGCTG AAAAATCCAT TGCCGATCAC
2101 GATACTCGCC TGAACGGTTT GGATAAAACA GTGTCAGACC TGCGCAAAGA
2151 AACCCGCCAA GGCCTTGCAG AACAAGCCGC GCTCTCCGGT CTGTTCCAAC
2201 CTTACAACGT GGGTCGGTTC AATGTAACGG CTGCAGTCGG CGGCTACAAA
2251 TCCGAATCGG CAGTCGCCAT CGGTACCGGC TTCCGCTTTA CCGAAAACTT
2301 TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC TTCGTCCGGT TCTTCCGCAG
2351 CCTACCATGT CGGCGTCAAT TACGAGTGGT AACTCGAG

1 MASPDVKSAD TLSKPAAPVV AEKETEVKED APQAGSQGQG APSTQGSQDM
51 AAVSAENTGN GGAATTDKPK NEDEGPQNDM PQNSAESANQ TGNNQPADSS
101 DSAPASNPAP ANGGSNFGRV DLANGVLIDG PSQNITLTHC KGDSCNGDNL
151 LDEEAPSKSE FENLNESERI EKYKKDGKSD KFTNLVATAV QANGTNKYVI
201 IYKDKSASSS SARFRRSARS RRSLPAEMPL IPVNQADTLI VDGEAVSLTG
251 HSGNIFAPEG NYRYLTYGAE KLPGGSYALR VQGEPAKGEM LAGTAVYNGE
301 VLHFHTENGR PYPTRGRFAA KVDFGSKSVD GIIDSGDDLH MGTQKFKAAI
351 DGNGFKGTWT ENGGGDVSGR FYGPAGEEVA GKYSYRPTDA EKGGFGVFAG
401 KKEQDGSGGG GATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE
451 DGTITKKDAT AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE
501 SEIEKLTTKL ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV
551 KIDEKLEAVA DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE
601 TKQNVDAKVK AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN
651 KDNIAKKANS ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH
701 DTRLNGLDKT VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK
751 SESAVAIGTG FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEW*
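
Whether a fusion is in-frame can be checked by translating the nucleotide listing and comparing the result with the protein listing. The Python sketch below does this for two short stretches copied from the ΔG287 hybrid sequences: the first 50 nucleotides, and the junction beginning at nucleotide 1201 that encodes the end of ΔG287 and the glycine-rich linker. The standard genetic code is assumed, and the names `CODON_TABLE` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1; spaces are ignored and a trailing partial codon is dropped."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    # First row of the hybrid nucleotide listings (nucleotides 1-50).
    start = "ATGGCTAGCC CCGATGTTAA ATCGGCGGAC ACGCTGTCAA AACCGGCCGC"
    # Nucleotides 1201-1233: end of the 287 portion, BamHI site (GGATCC) and linker.
    junction = "AAAAAAGAGC AGGATGGATC CGGAGGAGGA GGA"
    print(translate(start))
    print(translate(junction))
```

The two translations correspond to the start of the protein listings (MASPDVKSAD TLSKPA...) and to the ...KKEQDGSGGG G... stretch that joins ΔG287 to the downstream protein.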





                   ELISA      Bactericidal
ΔG287-953-His      3834       65536
ΔG287-961-His      108627     65536



The bactericidal efficacy (homologous strain) of antibodies raised against the hybrid proteins 
was compared with antibodies raised against simple mixtures of the component antigens 
(using 287-GST) for 919 and ORF46.1: 





             Mixture with 287    Hybrid with ΔG287
919          32000               128000
ORF46.1      128                 16000
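
The size of the improvement can be expressed as a fold change in bactericidal titre. The Python sketch below computes it from the homologous-strain titres in the table above; the dictionary layout and function name are illustrative only.

```python
# Homologous-strain bactericidal titres from the table above:
# (mixture with 287, hybrid with delta-G287).
TITRES = {
    "919": (32000, 128000),
    "ORF46.1": (128, 16000),
}


def fold_increase(mixture_titre, hybrid_titre):
    """Ratio of the hybrid titre to the titre of the simple mixture."""
    return hybrid_titre / mixture_titre


if __name__ == "__main__":
    for antigen, (mixture, hybrid) in TITRES.items():
        print(f"{antigen}: {fold_increase(mixture, hybrid):g}-fold")
```

On these figures the hybrid gives a 4-fold higher titre for 919 and a 125-fold higher titre for ORF46.1.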



Data for bactericidal activity against heterologous MenB strains and against serogroups A and
C were also obtained:

                   919                    ORF46.1
Strain             Mixture    Hybrid      Mixture    Hybrid
NGH38              1024       32000                  16384
MC58               512        8192                   512
BZ232              512        512
MenA (F6124)       512        32000                  8192
MenC (C11)         >2048      >2048
MenC (BZ133)       >4096      64000                  8192


The hybrid proteins with ΔG287 at the N-terminus are therefore immunologically superior to
simple mixtures, with ΔG287-ORF46.1 being particularly effective, even against
heterologous strains. ΔG287-ORF46.1K may be expressed in pET-24b.

The same hybrid proteins were made using New Zealand strain 394/98 rather than 2996: 

ΔG287NZ-919

1 ATGGCTAGCC CCGATGTCAA GTCGGCGGAC ACGCTGTCAA AACCTGCCGC
51 CCCTGTTGTT TCTGAAAAAG AGACAGAGGC AAAGGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCG CACAAGGCGG TCAAGATATG
151 GCGGCGGTTT CGGAAGAAAA TACAGGCAAT GGCGGTGCGG CAGCAACGGA
201 CAAACCCAAA AATGAAGACG AGGGGGCGCA AAATGATATG CCGCAAAATG
251 CCGCCGATAC AGATAGTTTG ACACCGAATC ACACCCCGGC TTCGAATATG
301 CCGGCCGGAA ATATGGAAAA CCAAGCACCG GATGCCGGGG AATCGGAGCA
351 GCCGGCAAAC CAACCGGATA TGGCAAATAC GGCGGACGGA ATGCAGGGTG
401 ACGATCCGTC GGCAGGCGGG GAAAATGCCG GCAATACGGC TGCCCAAGGT
451 ACAAATCAAG CCGAAAACAA TCAAACCGCC GGTTCTCAAA ATCCTGCCTC
501 TTCAACCAAT CCTAGCGCCA CGAATAGCGG TGGTGATTTT GGAAGGACGA
551 ACGTGGGCAA TTCTGTTGTG ATTGACGGGC CGTCGCAAAA TATAACGTTG
601 ACCCACTGTA AAGGCGATTC TTGTAGTGGC AATAATTTCT TGGATGAAGA
651 AGTACAGCTA AAATCAGAAT TTGAAAAATT AAGTGATGCA GACAAAATAA
701 GTAATTACAA GAAAGATGGG AAGAATGACG GGAAGAATGA TAAATTTGTC
751 GGTTTGGTTG CCGATAGTGT GCAGATGAAG GGAATCAATC AATATATTAT
801 CTTTTATAAA CCTAAACCCA CTTCATTTGC GCGATTTAGG CGTTCTGCAC
851 GGTCGAGGCG GTCGCTTCCG GCCGAGATGC CGCTGATTCC CGTCAATCAG
901 GCGGATACGC TGATTGTCGA TGGGGAAGCG GTCAGCCTGA CGGGGCATTC
951 CGGCAATATC TTCGCGCCCG AAGGGAATTA CCGGTATCTG ACTTACGGGG
1001 CGGAAAAATT GCCCGGCGGA TCGTATGCCC TCCGTGTTCA AGGCGAACCT
1051 TCAAAAGGCG AAATGCTCGC GGGCACGGCA GTGTACAACG GCGAAGTGCT
1101 GCATTTTCAT ACGGAAAACG GCCGTCCGTC CCCGTCCAGA GGCAGGTTTG
1151 CCGCAAAAGT CGATTTCGGC AGCAAATCTG TGGACGGCAT TATCGACAGC
1201 GGCGATGGTT TGCATATGGG TACGCAAAAA TTCAAAGCCG CCATCGATGG
1251 AAACGGCTTT AAGGGGACTT GGACGGAAAA TGGCGGCGGG GATGTTTCCG
1301 GAAAGTTTTA CGGCCCGGCC GGCGAGGAAG TGGCGGGAAA ATACAGCTAT
1351 CGCCCAACAG ATGCGGAAAA GGGCGGATTC GGCGTGTTTG CCGGCAAAAA
1401 AGAGCAGGAT GGATCCGGAG GAGGAGGATG CCAAAGCAAG AGCATCCAAA
1451 CCTTTCCGCA ACCCGACACA TCCGTCATCA ACGGCCCGGA CCGGCCGGTC
1501 GGCATCCCCG ACCCCGCCGG AACGACGGTC GGCGGCGGCG GGGCCGTCTA
1551 TACCGTTGTA CCGCACCTGT CCCTGCCCCA CTGGGCGGCG CAGGATTTCG
1601 CCAAAAGCCT GCAATCCTTC CGCCTCGGCT GCGCCAATTT GAAAAAGCGC
1651 CAAGGCTGGC AGGATGTGTG CGCCCAAGCC TTTCAAACCC CCGTCCATTC
1701 CTTTCAGGCA AAACAGTTTT TTGAACGCTA TTTCACGCCG TGGCAGGTTG
1751 CAGGCAACGG AAGCCTTGCC GGTACGGTTA CCGGCTATTA CGAGCCGGTG
1801 CTGAAGGGCG ACGACAGGCG GACGGCACAA GCCCGCTTCC CGATTTACGG
1851 TATTCCCGAC GATTTTATCT CCGTCCCCCT GCCTGCCGGT TTGCGGAGCG
1901 GAAAAGCCCT TGTCCGCATC AGGCAGACGG GAAAAAACAG CGGCACAATC
1951 GACAATACCG GCGGCACACA TACCGCCGAC CTCTCCCGAT TCCCCATCAC
2001 CGCGCGCACA ACGGCAATCA AAGGCAGGTT TGAAGGAAGC CGCTTCCTCC
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
2051 CCTACCACAC GCGCAACCAA ATCAACGGCG GCGCGCTTGA CGGCAAAGCC
2101 CCGATACTCG GTTACGCCGA AGACCCCGTC GAACTTTTTT TTATGCACAT
2151 CCAAGGCTCG GGCCGTCTGA AAACCCCGTC CGGCAAATAC ATCCGCATCG
2201 GCTATGCCGA CAAAAACGAA CATCCCTACG TTTCCATCGG ACGCTATATG
2251 GCGGACAAAG GCTACCTCAA GCTCGGGCAG ACCTCGATGC AGGGCATCAA
2301 AGCCTATATG CGGCAAAATC CGCAACGCCT CGCCGAAGTT TTGGGTCAAA
2351 ACCCCAGCTA TATCTTTTTC CGCGAGCTTG CCGGAAGCAG CAATGACGGT
2401 CCCGTCGGCG CACTGGGCAC GCCGTTGATG GGGGAATATG CCGGCGCAGT
2451 CGACCGGCAC TACATTACCT TGGGCGCGCC CTTATTTGTC GCCACCGCCC
2501 ATCCGGTTAC CCGCAAAGCC CTCAACCGCC TGATTATGGC GCAGGATACC
2551 GGCAGCGCGA TTAAAGGCGC GGTGCGCGTG GATTATTTTT GGGGATACGG
2601 CGACGAAGCC GGCGAACTTG CCGGCAAACA GAAAACCACG GGTTACGTCT
2651 GGCAGCTCCT ACCCAACGGT ATGAAGCCCG AATACCGCCC GTAAAAGCTT

1 MASPDVKSAD TLSKPAAPVV SEKETEAKED APQAGSQGQG APSAQGGQDM
51 AAVSEENTGN GGAAATDKPK NEDEGAQNDM PQNAADTDSL TPNHTPASNM
101 PAGNMENQAP DAGESEQPAN QPDMANTADG MQGDDPSAGG ENAGNTAAQG
151 TNQAENNQTA GSQNPASSTN PSATNSGGDF GRTNVGNSVV IDGPSQNITL
201 THCKGDSCSG NNFLDEEVQL KSEFEKLSDA DKISNYKKDG KNDGKNDKFV
251 GLVADSVQMK GINQYIIFYK PKPTSFARFR RSARSRRSLP AEMPLIPVNQ
301 ADTLIVDGEA VSLTGHSGNI FAPEGNYRYL TYGAEKLPGG SYALRVQGEP
351 SKGEMLAGTA VYNGEVLHFH TENGRPSPSR GRFAAKVDFG SKSVDGIIDS
401 GDGLHMGTQK FKAAIDGNGF KGTWTENGGG DVSGKFYGPA GEEVAGKYSY
451 RPTDAEKGGF GVFAGKKEQD GSGGGGCQSK SIQTFPQPDT SVINGPDRPV
501 GIPDPAGTTV GGGGAVYTVV PHLSLPHWAA QDFAKSLQSF RLGCANLKNR
551 QGWQDVCAQA FQTPVHSFQA KQFFERYFTP WQVAGNGSLA GTVTGYYEPV
601 LKGDDRRTAQ ARFPIYGIPD DFISVPLPAG LRSGKALVRI RQTGKNSGTI
651 DNTGGTHTAD LSRFPITART TAIKGRFEGS RFLPYHTRNQ INGGALDGKA
701 PILGYAEDPV ELFFMHIQGS GRLKTPSGKY IRIGYADKNE HPYVSIGRYM
751 ADKGYLKLGQ TSMQGIKAYM RQNPQRLAEV LGQNPSYIFF RELAGSSNDG
801 PVGALGTPLM GEYAGAVDRH YITLGAPLFV ATAHPVTRKA LNRLIMAQDT
851 GSAIKGAVRV DYFWGYGDEA GELAGKQKTT GYVWQLLPNG MKPEYRP*
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AG287NZ-953 



1 ATGGCTAGCC CCGATGTCAA GTCGGCGGAC ACGCTGTCAA AACCTGCCGC
51 CCCTGTTGTT TCTGAAAAAG AGACAGAGGC AAAGGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCG CACAAGGCGG TCAAGATATG
151 GCGGCGGTTT CGGAAGAAAA TACAGGCAAT GGCGGTGCGG CAGCAACGGA
201 CAAACCCAAA AATGAAGACG AGGGGGCGCA AAATGATATG CCGCAAAATG
251 CCGCCGATAC AGATAGTTTG ACACCGAATC ACACCCCGGC TTCGAATATG
301 CCGGCCGGAA ATATGGAAAA CCAAGCACCG GATGCCGGGG AATCGGAGCA
351 GCCGGCAAAC CAACCGGATA TGGCAAATAC GGCGGACGGA ATGCAGGGTG
401 ACGATCCGTC GGCAGGCGGG GAAAATGCCG GCAATACGGC TGCCCAAGGT
451 ACAAATCAAG CCGAAAACAA TCAAACCGCC GGTTCTCAAA ATCCTGCCTC
501 TTCAACCAAT CCTAGCGCCA CGAATAGCGG TGGTGATTTT GGAAGGACGA
551 ACGTGGGCAA TTCTGTTGTG ATTGACGGGC CGTCGCAAAA TATAACGTTG
601 ACCCACTGTA AAGGCGATTC TTGTAGTGGC AATAATTTCT TGGATGAAGA
651 AGTACAGCTA AAATCAGAAT TTGAAAAATT AAGTGATGCA GACAAAATAA
701 GTAATTACAA GAAAGATGGG AAGAATGACG GGAAGAATGA TAAATTTGTC
751 GGTTTGGTTG CCGATAGTGT GCAGATGAAG GGAATCAATC AATATATTAT
801 CTTTTATAAA CCTAAACCCA CTTCATTTGC GCGATTTAGG CGTTCTGCAC
851 GGTCGAGGCG GTCGCTTCCG GCCGAGATGC CGCTGATTCC CGTCAATCAG
901 GCGGATACGC TGATTGTCGA TGGGGAAGCG GTCAGCCTGA CGGGGCATTC
951 CGGCAATATC TTCGCGCCCG AAGGGAATTA CCGGTATCTG ACTTACGGGG
1001 CGGAAAAATT GCCCGGCGGA TCGTATGCCC TCCGTGTTCA AGGCGAACCT
1051 TCAAAAGGCG AAATGCTCGC GGGCACGGCA GTGTACAACG GCGAAGTGCT
1101 GCATTTTCAT ACGGAAAACG GCCGTCCGTC CCCGTCCAGA GGCAGGTTTG
1151 CCGCAAAAGT CGATTTCGGC AGCAAATCTG TGGACGGCAT TATCGACAGC
1201 GGCGATGGTT TGCATATGGG TACGCAAAAA TTCAAAGCCG CCATCGATGG
1251 AAACGGCTTT AAGGGGACTT GGACGGAAAA TGGCGGCGGG GATGTTTCCG
1301 GAAAGTTTTA CGGCCCGGCC GGCGAGGAAG TGGCGGGAAA ATACAGCTAT
1351 CGCCCAACAG ATGCGGAAAA GGGCGGATTC GGCGTGTTTG CCGGCAAAAA
1401 AGAGCAGGAT GGATCCGGAG GAGGAGGAGC CACCTACAAA GTGGACGAAT
1451 ATCACGCCAA CGCCCGTTTC GCCATCGACC ATTTCAACAC CAGCACCAAC
1501 GTCGGCGGTT TTTACGGTCT GACCGGTTCC GTCGAGTTCG ACCAAGCAAA
1551 ACGCGACGGT AAAATCGACA TCACCATCCC CGTTGCCAAC CTGCAAAGCG
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1601 GTTCGCAACA CTTTACCGAC CACCTGAAAT CAGCCGACAT CTTCGATGCC 

1651 GCCCAATATC CGGACATCCG CTTTGTTTCC ACCAAATTCA ACTTCAACGG 

1701 CAAAAAACTG GTTTCCGTTG ACGGCAACCT GACCATGCAC GGCAAAACCG 

1751 CCCCCGTCAA ACTCAAAGCC GAAAAATTCA ACTGCTACCA AAGCCCGATG 

1801 GCGAAAACCG AAGTTTGCGG CGGCGACTTC AGCACCACCA TCGACCGCAC
1851 CAAATGGGGC GTGGACTACC TCGTTAACGT TGGTATGACC AAAAGCGTCC
1901 GCATCGACAT CCAAATCGAG GCAGCCAAAC AATAAAAGCT T

1 MASPDVKSAD TLSKPAAPVV SEKETEAKED APQAGSQGQG APSAQGGQDM
51 AAVSEENTGN GGAAATDKPK NEDEGAQNDM PQNAADTDSL TPNHTPASNM
101 PAGNMENQAP DAGESEQPAN QPDMANTADG MQGDDPSAGG ENAGNTAAQG
151 TNQAENNQTA GSQNPASSTN PSATNSGGDF GRTNVGNSVV IDGPSQNITL
201 THCKGDSCSG NNFLDEEVQL KSEFEKLSDA DKISNYKKDG KNDGKNDKFV
251 GLVADSVQMK GINQYIIFYK PKPTSFARFR RSARSRRSLP AEMPLIPVNQ
301 ADTLIVDGEA VSLTGHSGNI FAPEGNYRYL TYGAEKLPGG SYALRVQGEP
351 SKGEMLAGTA VYNGEVLHFH TENGRPSPSR GRFAAKVDFG SKSVDGIIDS
401 GDGLHMGTQK FKAAIDGNGF KGTWTENGGG DVSGKFYGPA GEEVAGKYSY
451 RPTDAEKGGF GVFAGKKEQD GSGGGGATYK VDEYHANARF AIDHFNTSTN
501 VGGFYGLTGS VEFDQAKRDG KIDITIPVAN LQSGSQHFTD HLKSADIFDA
551 AQYPDIRFVS TKFNFNGKKL VSVDGNLTMH GKTAPVKLKA EKFNCYQSPM
601 AKTEVCGGDF STTIDRTKWG VDYLVNVGMT KSVRIDIQIE AAKQ*

AG287NZ-961 

1 ATGGCTAGCC CCGATGTCAA GTCGGCGGAC ACGCTGTCAA AACCTGCCGC
51 CCCTGTTGTT TCTGAAAAAG AGACAGAGGC AAAGGAAGAT GCGCCACAGG
101 CAGGTTCTCA AGGACAGGGC GCGCCATCCG CACAAGGCGG TCAAGATATG
151 GCGGCGGTTT CGGAAGAAAA TACAGGCAAT GGCGGTGCGG CAGCAACGGA
201 CAAACCCAAA AATGAAGACG AGGGGGCGCA AAATGATATG CCGCAAAATG
251 CCGCCGATAC AGATAGTTTG ACACCGAATC ACACCCCGGC TTCGAATATG
301 CCGGCCGGAA ATATGGAAAA CCAAGCACCG GATGCCGGGG AATCGGAGCA
351 GCCGGCAAAC CAACCGGATA TGGCAAATAC GGCGGACGGA ATGCAGGGTG
401 ACGATCCGTC GGCAGGCGGG GAAAATGCCG GCAATACGGC TGCCCAAGGT
451 ACAAATCAAG CCGAAAACAA TCAAACCGCC GGTTCTCAAA ATCCTGCCTC
501 TTCAACCAAT CCTAGCGCCA CGAATAGCGG TGGTGATTTT GGAAGGACGA
551 ACGTGGGCAA TTCTGTTGTG ATTGACGGGC CGTCGCAAAA TATAACGTTG
601 ACCCACTGTA AAGGCGATTC TTGTAGTGGC AATAATTTCT TGGATGAAGA
651 AGTACAGCTA AAATCAGAAT TTGAAAAATT AAGTGATGCA GACAAAATAA
701 GTAATTACAA GAAAGATGGG AAGAATGACG GGAAGAATGA TAAATTTGTC
751 GGTTTGGTTG CCGATAGTGT GCAGATGAAG GGAATCAATC AATATATTAT
801 CTTTTATAAA CCTAAACCCA CTTCATTTGC GCGATTTAGG CGTTCTGCAC
851 GGTCGAGGCG GTCGCTTCCG GCCGAGATGC CGCTGATTCC CGTCAATCAG
901 GCGGATACGC TGATTGTCGA TGGGGAAGCG GTCAGCCTGA CGGGGCATTC
951 CGGCAATATC TTCGCGCCCG AAGGGAATTA CCGGTATCTG ACTTACGGGG
1001 CGGAAAAATT GCCCGGCGGA TCGTATGCCC TCCGTGTTCA AGGCGAACCT
1051 TCAAAAGGCG AAATGCTCGC GGGCACGGCA GTGTACAACG GCGAAGTGCT
1101 GCATTTTCAT ACGGAAAACG GCCGTCCGTC CCCGTCCAGA GGCAGGTTTG
1151 CCGCAAAAGT CGATTTCGGC AGCAAATCTG TGGACGGCAT TATCGACAGC
1201 GGCGATGGTT TGCATATGGG TACGCAAAAA TTCAAAGCCG CCATCGATGG
1251 AAACGGCTTT AAGGGGACTT GGACGGAAAA TGGCGGCGGG GATGTTTCCG
1301 GAAAGTTTTA CGGCCCGGCC GGCGAGGAAG TGGCGGGAAA ATACAGCTAT
1351 CGCCCAACAG ATGCGGAAAA GGGCGGATTC GGCGTGTTTG CCGGCAAAAA
1401 AGAGCAGGAT GGATCCGGAG GAGGAGGAGC CACAAACGAC GACGATGTTA
1451 AAAAAGCTGC CACTGTGGCC ATTGCTGCTG CCTACAACAA TGGCCAAGAA
1501 ATCAACGGTT TCAAAGCTGG AGAGACCATC TACGACATTG ATGAAGACGG
1551 CACAATTACC AAAAAAGACG CAACTGCAGC CGATGTTGAA GCCGACGACT
1601 TTAAAGGTCT GGGTCTGAAA AAAGTCGTGA CTAACCTGAC CAAAACCGTC
1651 AATGAAAACA AACAAAACGT CGATGCCAAA GTAAAAGCTG CAGAATCTGA
1701 AATAGAAAAG TTAACAACCA AGTTAGCAGA CACTGATGCC GCTTTAGCAG
1751 ATACTGATGC CGCTCTGGAT GCAACCACCA ACGCCTTGAA TAAATTGGGA
1801 GAAAATATAA CGACATTTGC TGAAGAGACT AAGACAAATA TCGTAAAAAT
1851 TGATGAAAAA TTAGAAGCCG TGGCTGATAC CGTCGACAAG CATGCCGAAG
1901 CATTCAACGA TATCGCCGAT TCATTGGATG AAACCAACAC TAAGGCAGAC
1951 GAAGCCGTCA AAACCGCCAA TGAAGCCAAA CAGACGGCCG AAGAAACCAA
2001 ACAAAACGTC GATGCCAAAG TAAAAGCTGC AGAAACTGCA GCAGGCAAAG
2051 CCGAAGCTGC CGCTGGCACA GCTAATACTG CAGCCGACAA GGCCGAAGCT
2101 GTCGCTGCAA AAGTTACCGA CATCAAAGCT GATATCGCTA CGAACAAAGA
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2151 TAATATTGCT AAAAAAGCAA ACAGTGCCGA CGTGTACACC AGAGAAGAGT 

2201 CTGACAGCAA ATTTGTCAGA ATTGATGGTC TGAACGCTAC TACCGAAAAA
2251 TTGGACACAC GCTTGGCTTC TGCTGAAAAA TCCATTGCCG ATCACGATAC
2301 TCGCCTGAAC GGTTTGGATA AAACAGTGTC AGACCTGCGC AAAGAAACCC
2351 GCCAAGGCCT TGCAGAACAA GCCGCGCTCT CCGGTCTGTT CCAACCTTAC
2401 AACGTGGGTC GGTTCAATGT AACGGCTGCA GTCGGCGGCT ACAAATCCGA
2451 ATCGGCAGTC GCCATCGGTA CCGGCTTCCG CTTTACCGAA AACTTTGCCG
2501 CCAAAGCAGG CGTGGCAGTC GGCACTTCGT CCGGTTCTTC CGCAGCCTAC
2551 CATGTCGGCG TCAATTACGA GTGGTAAAAG CTT

1 MASPDVKSAD TLSKPAAPVV SEKETEAKED APQAGSQGQG APSAQGGQDM
51 AAVSEENTGN GGAAATDKPK NEDEGAQNDM PQNAADTDSL TPNHTPASNM
101 PAGNMENQAP DAGESEQPAN QPDMANTADG MQGDDPSAGG ENAGNTAAQG
151 TNQAENNQTA GSQNPASSTN PSATNSGGDF GRTNVGNSVV IDGPSQNITL
201 THCKGDSCSG NNFLDEEVQL KSEFEKLSDA DKISNYKKDG KNDGKNDKFV
251 GLVADSVQMK GINQYIIFYK PKPTSFARFR RSARSRRSLP AEMPLIPVNQ
301 ADTLIVDGEA VSLTGHSGNI FAPEGNYRYL TYGAEKLPGG SYALRVQGEP
351 SKGEMLAGTA VYNGEVLHFH TENGRPSPSR GRFAAKVDFG SKSVDGIIDS
401 GDGLHMGTQK FKAAIDGNGF KGTWTENGGG DVSGKFYGPA GEEVAGKYSY
451 RPTDAEKGGF GVFAGKKEQD GSGGGGATND DDVKKAATVA IAAAYNNGQE
501 INGFKAGETI YDIDEDGTIT KKDATAADVE ADDFKGLGLK KVVTNLTKTV
551 NENKQNVDAK VKAAESEIEK LTTKLADTDA ALADTDAALD ATTNALNKLG
601 ENITTFAEET KTNIVKIDEK LEAVADTVDK HAEAFNDIAD SLDETNTKAD
651 EAVKTANEAK QTAEETKQNV DAKVKAAETA AGKAEAAAGT ANTAADKAEA
701 VAAKVTDIKA DIATNKDNIA KKANSADVYT REESDSKFVR IDGLNATTEK
751 LDTRLASAEK SIADHDTRLN GLDKTVSDLR KETRQGLAEQ AALSGLFQPY
801 NVGRFNVTAA VGGYKSESAV AIGTGFRFTE NFAAKAGVAV GTSSGSSAAY
851 HVGVNYEW*
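
The nucleotide and protein listings in this section can be cross-checked against each other by translation. The following is a minimal sketch, assuming the standard genetic code and reading frame 1; the function names `clean` and `translate` and the stripping of position numbers and spaces are illustrative choices and not part of the original disclosure. It translates the first 30 nucleotides of the AG287NZ-961 nucleotide listing, which should reproduce the first ten residues of the corresponding protein listing.

```python
from itertools import product

# Standard genetic code with codons enumerated in TCAG order.
# Assumption: the document does not state which translation table applies.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(listing: str) -> str:
    """Drop position numbers, spaces and line breaks from a nucleotide listing."""
    return "".join(ch for ch in listing.upper() if ch in BASES)


def translate(listing: str) -> str:
    """Translate a listing in frame 1; a trailing partial codon is ignored."""
    seq = clean(listing)
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First row of the AG287NZ-961 nucleotide listing (first three blocks only).
first_row = "1 ATGGCTAGCC CCGATGTCAA GTCGGCGGAC"
print(translate(first_row))  # MASPDVKSAD
```

The same check applied row by row is a quick way to spot characters that were misread when a listing was scanned, because a single wrong base usually changes the translated residue.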

AG983 and hybrids 

Bactericidal titres generated in response to AG983 (His-fusion) were measured against 
various strains, including the homologous 2996 strain: 





            2996    NGH38    BZ133
AG983        512      128      128



AG983 was also expressed as a hybrid, with ORF46.1, 741, 961 or 961c at its C-terminus: 

AG983-ORF46.1 



1 ATGACTTCTG CGCCCGACTT CAATGCAGGC GGTACCGGTA TCGGCAGCAA
51 CAGCAGAGCA ACAACAGCGA AATCAGCAGC AGTATCTTAC GCCGGTATCA
101 AGAACGAAAT GTGCAAAGAC AGAAGCATGC TCTGTGCCGG TCGGGATGAC
151 GTTGCGGTTA CAGACAGGGA TGCCAAAATC AATGCCCCCC CCCCGAATCT
201 GCATACCGGA GACTTTCCAA ACCCAAATGA CGCATACAAG AATTTGATCA
251 ACCTCAAACC TGCAATTGAA GCAGGCTATA CAGGACGCGG GGTAGAGGTA
301 GGTATCGTCG ACACAGGCGA ATCCGTCGGC AGCATATCCT TTCCCGAACT
351 GTATGGCAGA AAAGAACACG GCTATAACGA AAATTACAAA AACTATACGG
401 CGTATATGCG GAAGGAAGCG CCTGAAGACG GAGGCGGTAA AGACATTGAA
451 GCTTCTTTCG ACGATGAGGC CGTTATAGAG ACTGAAGCAA AGCCGACGGA
501 TATCCGCCAC GTAAAAGAAA TCGGACACAT CGATTTGGTC TCCCATATTA
551 TTGGCGGGCG TTCCGTGGAC GGCAGACCTG CAGGCGGTAT TGCGCCCGAT
601 GCGACGCTAC ACATAATGAA TACGAATGAT GAAACCAAGA ACGAAATGAT
651 GGTTGCAGCC ATCCGCAATG CATGGGTCAA GCTGGGCGAA CGTGGCGTGC
701 GCATCGTCAA TAACAGTTTT GGAACAACAT CGAGGGCAGG CACTGCCGAC
751 CTTTTCCAAA TAGCCAATTC GGAGGAGCAG TACCGCCAAG CGTTGCTCGA
801 CTATTCCGGC GGTGATAAAA CAGACGAGGG TATCCGCCTG ATGCAACAGA
851 GCGATTACGG CAACCTGTCC TACCACATCC GTAATAAAAA CATGCTTTTC
901 ATCTTTTCGA CAGGCAATGA CGCACAAGCT CAGCCCAACA CATATGCCCT
951 ATTGCCATTT TATGAAAAAG ACGCTCAAAA AGGCATTATC ACAGTCGCAG
1001 GCGTAGACCG CAGTGGAGAA AAGTTCAAAC GGGAAATGTA TGGAGAACCG
1051 GGTACAGAAC CGCTTGAGTA TGGCTCCAAC CATTGCGGAA TTACTGCCAT
1101 GTGGTGCCTG TCGGCACCCT ATGAAGCAAG CGTCCGTTTC ACCCGTACAA
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
1151 ACCCGATTCA AATTGCCGGA ACATCCTTTT CCGCACCCAT CGTAACCGGC
1201 ACGGCGGCTC TGCTGCTGCA GAAATACCCG TGGATGAGCA ACGACAACCT
1251 GCGTACCACG TTGCTGACGA CGGCTCAGGA CATCGGTGCA GTCGGCGTGG
1301 ACAGCAAGTT CGGCTGGGGA CTGCTGGATG CGGGTAAGGC CATGAACGGA
1351 CCCGCGTCCT TTCCGTTCGG CGACTTTACC GCCGATACGA AAGGTACATC
1401 CGATATTGCC TACTCCTTCC GTAACGACAT TTCAGGCACG GGCGGCCTGA
1451 TCAAAAAAGG CGGCAGCCAA CTGCAACTGC ACGGCAACAA CACCTATACG
1501 GGCAAAACCA TTATCGAAGG CGGTTCGCTG GTGTTGTACG GCAACAACAA
1551 ATCGGATATG CGCGTCGAAA CCAAAGGTGC GCTGATTTAT AACGGGGCGG
1601 CATCCGGCGG CAGCCTGAAC AGCGACGGCA TTGTCTATCT GGCAGATACC
1651 GACCAATCCG GCGCAAACGA AACCGTACAC ATCAAAGGCA GTCTGCAGCT
1701 GGACGGCAAA GGTACGCTGT ACACACGTTT GGGCAAACTG CTGAAAGTGG
1751 ACGGTACGGC GATTATCGGC GGCAAGCTGT ACATGTCGGC ACGCGGCAAG
1801 GGGGCAGGCT ATCTCAACAG TACCGGACGA CGTGTTCCCT TCCTGAGTGC
1851 CGCCAAAATC GGGCAGGATT ATTCTTTCTT CACAAACATC GAAACCGACG
1901 GCGGCCTGCT GGCTTCCCTC GACAGCGTCG AAAAAACAGC GGGCAGTGAA
1951 GGCGACACGC TGTCCTATTA TGTCCGTCGC GGCAATGCGG CACGGACTGC
2001 TTCGGCAGCG GCACATTCCG CGCCCGCCGG TCTGAAACAC GCCGTAGAAC
2051 AGGGCGGCAG CAATCTGGAA AACCTGATGG TCGAACTGGA TGCCTCCGAA
2101 TCATCCGCAA CACCCGAGAC GGTTGAAACT GCGGCAGCCG ACCGCACAGA
2151 TATGCCGGGC ATCCGCCCCT ACGGCGCAAC TTTCCGCGCA GCGGCAGCCG
2201 TACAGCATGC GAATGCCGCC GACGGTGTAC GCATCTTCAA CAGTCTCGCC
2251 GCTACCGTCT ATGCCGACAG TACCGCCGCC CATGCCGATA TGCAGGGACG
2301 CCGCCTGAAA GCCGTATCGG ACGGGTTGGA CCACAACGGC ACGGGTCTGC
2351 GCGTCATCGC GCAAACCCAA CAGGACGGTG GAACGTGGGA ACAGGGCGGT
2401 GTTGAAGGCA AAATGCGCGG CAGTACCCAA ACCGTCGGCA TTGCCGCGAA
2451 AACCGGCGAA AATACGACAG CAGCCGCCAC ACTGGGCATG GGACGCAGCA
2501 CATGGAGCGA AAACAGTGCA AATGCAAAAA CCGACAGCAT TAGTCTGTTT
2551 GCAGGCATAC GGCACGATGC GGGCGATATC GGCTATCTCA AAGGCCTGTT
2601 CTCCTACGGA CGCTACAAAA ACAGCATCAG CCGCAGCACC GGTGCGGACG
2651 AACATGCGGA AGGCAGCGTC AACGGCACGC TGATGCAGCT GGGCGCACTG
2701 GGCGGTGTCA ACGTTCCGTT TGCCGCAACG GGAGATTTGA CGGTCGAAGG
2751 CGGTCTGCGC TACGACCTGC TCAAACAGGA TGCATTCGCC GAAAAAGGCA
2801 GTGCTTTGGG CTGGAGCGGC AACAGCCTCA CTGAAGGCAC GCTGGTCGGA
2851 CTCGCGGGTC TGAAGCTGTC GCAACCCTTG AGCGATAAAG CCGTCCTGTT
2901 TGCAACGGCG GGCGTGGAAC GCGACCTGAA CGGACGCGAC TACACGGTAA
2951 CGGGCGGCTT TACCGGCGCG ACTGCAGCAA CCGGCAAGAC GGGGGCACGC
3001 AATATGCCGC ACACCCGTCT GGTTGCCGGC CTGGGCGCGG ATGTCGAATT
3051 CGGCAACGGC TGGAACGGCT TGGCACGTTA CAGCTACGCC GGTTCCAAAC
3101 AGTACGGCAA CCACAGCGGA CGAGTCGGCG TAGGCTACCG GTTCCTCGAC
3151 GGTGGCGGAG GCACTGGATC CTCAGATTTG GCAAACGATT CTTTTATCCG
3201 GCAGGTTCTC GACCGTCAGC ATTTCGAACC CGACGGGAAA TACCACCTAT
3251 TCGGCAGCAG GGGGGAACTT GCCGAGCGCA GCGGCCATAT CGGATTGGGA
3301 AAAATACAAA GCCATCAGTT GGGCAACCTG ATGATTCAAC AGGCGGCCAT
3351 TAAAGGAAAT ATCGGCTACA TTGTCCGCTT TTCCGATCAC GGGCACGAAG
3401 TCCATTCCCC CTTCGACAAC CATGCCTCAC ATTCCGATTC TGATGAAGCC
3451 GGTAGTCCCG TTGACGGATT TAGCCTTTAC CGCATCCATT GGGACGGATA
3501 CGAACACCAT CCCGCCGACG GCTATGACGG GCCACAGGGC GGCGGCTATC
3551 CCGCTCCCAA AGGCGCGAGG GATATATACA GCTACGACAT AAAAGGCGTT
3601 GCCCAAAATA TCCGCCTCAA CCTGACCGAC AACCGCAGCA CCGGACAACG
3651 GCTTGCCGAC CGTTTCCACA ATGCCGGTAG TATGCTGACG CAAGGAGTAG
3701 GCGACGGATT CAAACGCGCC ACCCGATACA GCCCCGAGCT GGACAGATCG
3751 GGCAATGCCG CCGAAGCCTT CAACGGCACT GCAGATATCG TTAAAAACAT
3801 CATCGGCGCG GCAGGAGAAA TTGTCGGCGC AGGCGATGCC GTGCAGGGCA
3851 TAAGCGAAGG CTCAAACATT GCTGTCATGC ACGGCTTGGG TCTGCTTTCC
3901 ACCGAAAACA AGATGGCGCG CATCAACGAT TTGGCAGATA TGGCGCAACT
3951 CAAAGACTAT GCCGCAGCAG CCATCCGCGA TTGGGCAGTC CAAAACCCCA
4001 ATGCCGCACA AGGCATAGAA GCCGTCAGCA ATATCTTTAT GGCAGCCATC
4051 CCCATCAAAG GGATTGGAGC TGTTCGGGGA AAATACGGCT TGGGCGGCAT
4101 CACGGCACAT CCTATCAAGC GGTCGCAGAT GGGCGCGATC GCATTGCCGA
4151 AAGGGAAATC CGCCGTCAGC GACAATTTTG CCGATGCGGC ATACGCCAAA
4201 TACCCGTCCC CTTACCATTC CCGAAATATC CGTTCAAACT TGGAGCAGCG
4251 TTACGGCAAA GAAAACATCA CCTCCTCAAC CGTGCCGCCG TCAAACGGCA
4301 AAAATGTCAA ACTGGCAGAC CAACGCCACC CGAAGACAGG CGTACCGTTT
4351 GACGGTAAAG GGTTTCCGAA TTTTGAGAAG CACGTGAAAT ATGATACGCT
4401 CGAGCACCAC CACCACCACC ACTGA
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1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD
51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV
101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE
151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD
201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD
251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF
301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP
351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG
401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG
451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT
501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT
551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK
601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE
651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE
701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA
751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG
801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF
851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL
901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG
951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR
1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLD
1051 GGGGTGSSDL ANDSFIRQVL DRQHFEPDGK YHLFGSRGEL AERSGHIGLG
1101 KIQSHQLGNL MIQQAAIKGN IGYIVRFSDH GHEVHSPFDN HASHSDSDEA
1151 GSPVDGFSLY RIHWDGYEHH PADGYDGPQG GGYPAPKGAR DIYSYDIKGV
1201 AQNIRLNLTD NRSTGQRLAD RFHNAGSMLT QGVGDGFKRA TRYSPELDRS
1251 GNAAEAFNGT ADIVKNIIGA AGEIVGAGDA VQGISEGSNI AVMHGLGLLS
1301 TENKMARIND LADMAQLKDY AAAAIRDWAV QNPNAAQGIE AVSNIFMAAI
1351 PIKGIGAVRG KYGLGGITAH PIKRSQMGAI ALPKGKSAVS DNFADAAYAK
1401 YPSPYHSRNI RSNLEQRYGK ENITSSTVPP SNGKNVKLAD QRHPKTGVPF
1451 DGKGFPNFEK HVKYDTLEHH HHHH*
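
The hybrid protein listings can be inspected programmatically for two features that are visible in the sequences themselves: a run of glycines between the two fusion partners and, for the His-fusions, a terminal stretch of six histidines. The sketch below is a hypothetical helper and not a method described in the original text; the names `normalise`, `has_his_tag` and `split_at_linker` are illustrative, and treating `GGGG` as the start of the linker is an assumption made only for this example.

```python
from typing import Optional, Tuple


def normalise(listing: str) -> str:
    """Remove position numbers, spaces, line breaks and the '*' stop marker."""
    return "".join(ch for ch in listing.upper() if ch.isalpha())


def has_his_tag(listing: str, tag: str = "HHHHHH") -> bool:
    """True if the protein listing ends with the given tag."""
    return normalise(listing).endswith(tag)


def split_at_linker(listing: str, linker: str) -> Optional[Tuple[str, str]]:
    """Split at the first occurrence of `linker`.

    Returns (sequence before the linker, sequence after it), or None if the
    linker does not occur.
    """
    seq = normalise(listing)
    idx = seq.find(linker)
    if idx < 0:
        return None
    return seq[:idx], seq[idx + len(linker):]


# Last row of the AG983-ORF46.1 protein listing.
tail = "1451 DGKGFPNFEK HVKYDTLEHH HHHH*"
print(has_his_tag(tail))  # True

# Rows around the junction of the same listing (row 1001 and the first block of row 1051).
junction = (
    "1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLD "
    "1051 GGGGTGSSDL"
)
n_part, c_part = split_at_linker(junction, "GGGG")
print(n_part[-4:], c_part)  # RFLD TGSSDL
```

Passing the linker in as a parameter keeps the helper usable for the other hybrids, whose junctions read differently (for example `GSGGGG` in the AG287NZ hybrids listed earlier).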

AG983-741 

1 ATGACTTCTG CGCCCGACTT CAATGCAGGC GGTACCGGTA TCGGCAGCAA
51 CAGCAGAGCA ACAACAGCGA AATCAGCAGC AGTATCTTAC GCCGGTATCA
101 AGAACGAAAT GTGCAAAGAC AGAAGCATGC TCTGTGCCGG TCGGGATGAC
151 GTTGCGGTTA CAGACAGGGA TGCCAAAATC AATGCCCCCC CCCCGAATCT
201 GCATACCGGA GACTTTCCAA ACCCAAATGA CGCATACAAG AATTTGATCA
251 ACCTCAAACC TGCAATTGAA GCAGGCTATA CAGGACGCGG GGTAGAGGTA
301 GGTATCGTCG ACACAGGCGA ATCCGTCGGC AGCATATCCT TTCCCGAACT
351 GTATGGCAGA AAAGAACACG GCTATAACGA AAATTACAAA AACTATACGG
401 CGTATATGCG GAAGGAAGCG CCTGAAGACG GAGGCGGTAA AGACATTGAA
451 GCTTCTTTCG ACGATGAGGC CGTTATAGAG ACTGAAGCAA AGCCGACGGA
501 TATCCGCCAC GTAAAAGAAA TCGGACACAT CGATTTGGTC TCCCATATTA
551 TTGGCGGGCG TTCCGTGGAC GGCAGACCTG CAGGCGGTAT TGCGCCCGAT
601 GCGACGCTAC ACATAATGAA TACGAATGAT GAAACCAAGA ACGAAATGAT
651 GGTTGCAGCC ATCCGCAATG CATGGGTCAA GCTGGGCGAA CGTGGCGTGC
701 GCATCGTCAA TAACAGTTTT GGAACAACAT CGAGGGCAGG CACTGCCGAC
751 CTTTTCCAAA TAGCCAATTC GGAGGAGCAG TACCGCCAAG CGTTGCTCGA
801 CTATTCCGGC GGTGATAAAA CAGACGAGGG TATCCGCCTG ATGCAACAGA
851 GCGATTACGG CAACCTGTCC TACCACATCC GTAATAAAAA CATGCTTTTC
901 ATCTTTTCGA CAGGCAATGA CGCACAAGCT CAGCCCAACA CATATGCCCT
951 ATTGCCATTT TATGAAAAAG ACGCTCAAAA AGGCATTATC ACAGTCGCAG
1001 GCGTAGACCG CAGTGGAGAA AAGTTCAAAC GGGAAATGTA TGGAGAACCG
1051 GGTACAGAAC CGCTTGAGTA TGGCTCCAAC CATTGCGGAA TTACTGCCAT
1101 GTGGTGCCTG TCGGCACCCT ATGAAGCAAG CGTCCGTTTC ACCCGTACAA
1151 ACCCGATTCA AATTGCCGGA ACATCCTTTT CCGCACCCAT CGTAACCGGC
1201 ACGGCGGCTC TGCTGCTGCA GAAATACCCG TGGATGAGCA ACGACAACCT
1251 GCGTACCACG TTGCTGACGA CGGCTCAGGA CATCGGTGCA GTCGGCGTGG
1301 ACAGCAAGTT CGGCTGGGGA CTGCTGGATG CGGGTAAGGC CATGAACGGA
1351 CCCGCGTCCT TTCCGTTCGG CGACTTTACC GCCGATACGA AAGGTACATC
1401 CGATATTGCC TACTCCTTCC GTAACGACAT TTCAGGCACG GGCGGCCTGA
1451 TCAAAAAAGG CGGCAGCCAA CTGCAACTGC ACGGCAACAA CACCTATACG
1501 GGCAAAACCA TTATCGAAGG CGGTTCGCTG GTGTTGTACG GCAACAACAA
1551 ATCGGATATG CGCGTCGAAA CCAAAGGTGC GCTGATTTAT AACGGGGCGG
1601 CATCCGGCGG CAGCCTGAAC AGCGACGGCA TTGTCTATCT GGCAGATACC
1651 GACCAATCCG GCGCAAACGA AACCGTACAC ATCAAAGGCA GTCTGCAGCT
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1701 GGACGGCAAA GGTACGCTGT ACACACGTTT GGGCAAACTG CTGAAAGTGG
1751 ACGGTACGGC GATTATCGGC GGCAAGCTGT ACATGTCGGC ACGCGGCAAG
1801 GGGGCAGGCT ATCTCAACAG TACCGGACGA CGTGTTCCCT TCCTGAGTGC
1851 CGCCAAAATC GGGCAGGATT ATTCTTTCTT CACAAACATC GAAACCGACG
1901 GCGGCCTGCT GGCTTCCCTC GACAGCGTCG AAAAAACAGC GGGCAGTGAA
1951 GGCGACACGC TGTCCTATTA TGTCCGTCGC GGCAATGCGG CACGGACTGC
2001 TTCGGCAGCG GCACATTCCG CGCCCGCCGG TCTGAAACAC GCCGTAGAAC
2051 AGGGCGGCAG CAATCTGGAA AACCTGATGG TCGAACTGGA TGCCTCCGAA
2101 TCATCCGCAA CACCCGAGAC GGTTGAAACT GCGGCAGCCG ACCGCACAGA
2151 TATGCCGGGC ATCCGCCCCT ACGGCGCAAC TTTCCGCGCA GCGGCAGCCG
2201 TACAGCATGC GAATGCCGCC GACGGTGTAC GCATCTTCAA CAGTCTCGCC
2251 GCTACCGTCT ATGCCGACAG TACCGCCGCC CATGCCGATA TGCAGGGACG
2301 CCGCCTGAAA GCCGTATCGG ACGGGTTGGA CCACAACGGC ACGGGTCTGC
2351 GCGTCATCGC GCAAACCCAA CAGGACGGTG GAACGTGGGA ACAGGGCGGT
2401 GTTGAAGGCA AAATGCGCGG CAGTACCCAA ACCGTCGGCA TTGCCGCGAA
2451 AACCGGCGAA AATACGACAG CAGCCGCCAC ACTGGGCATG GGACGCAGCA
2501 CATGGAGCGA AAACAGTGCA AATGCAAAAA CCGACAGCAT TAGTCTGTTT
2551 GCAGGCATAC GGCACGATGC GGGCGATATC GGCTATCTCA AAGGCCTGTT
2601 CTCCTACGGA CGCTACAAAA ACAGCATCAG CCGCAGCACC GGTGCGGACG
2651 AACATGCGGA AGGCAGCGTC AACGGCACGC TGATGCAGCT GGGCGCACTG
2701 GGCGGTGTCA ACGTTCCGTT TGCCGCAACG GGAGATTTGA CGGTCGAAGG
2751 CGGTCTGCGC TACGACCTGC TCAAACAGGA TGCATTCGCC GAAAAAGGCA
2801 GTGCTTTGGG CTGGAGCGGC AACAGCCTCA CTGAAGGCAC GCTGGTCGGA
2851 CTCGCGGGTC TGAAGCTGTC GCAACCCTTG AGCGATAAAG CCGTCCTGTT
2901 TGCAACGGCG GGCGTGGAAC GCGACCTGAA CGGACGCGAC TACACGGTAA
2951 CGGGCGGCTT TACCGGCGCG ACTGCAGCAA CCGGCAAGAC GGGGGCACGC
3001 AATATGCCGC ACACCCGTCT GGTTGCCGGC CTGGGCGCGG ATGTCGAATT
3051 CGGCAACGGC TGGAACGGCT TGGCACGTTA CAGCTACGCC GGTTCCAAAC
3101 AGTACGGCAA CCACAGCGGA CGAGTCGGCG TAGGCTACCG GTTCCTCGAG
3151 GGATCCGGAG GGGGTGGTGT CGCCGCCGAC ATCGGTGCGG GGCTTGCCGA
3201 TGCACTAACC GCACCGCTCG ACCATAAAGA CAAAGGTTTG CAGTCTTTGA
3251 CGCTGGATCA GTCCGTCAGG AAAAACGAGA AACTGAAGCT GGCGGCACAA
3301 GGTGCGGAAA AAACTTATGG AAACGGTGAC AGCCTCAATA CGGGCAAATT
3351 GAAGAACGAC AAGGTCAGCC GTTTCGACTT TATCCGCCAA ATCGAAGTGG
3401 ACGGGCAGCT CATTACCTTG GAGAGTGGAG AGTTCCAAGT ATACAAACAA
3451 AGCCATTCCG CCTTAACCGC CTTTCAGACC GAGCAAATAC AAGATTCGGA
3501 GCATTCCGGG AAGATGGTTG CGAAACGCCA GTTCAGAATC GGCGACATAG
3551 CGGGCGAACA TACATCTTTT GACAAGCTTC CCGAAGGCGG CAGGGCGACA
3601 TATCGCGGGA CGGCGTTCGG TTCAGACGAT GCCGGCGGAA AACTGACCTA
3651 CACCATAGAT TTCGCCGCCA AGCAGGGAAA CGGCAAAATC GAACATTTGA
3701 AATCGCCAGA ACTCAATGTC GACCTGGCCG CCGCCGATAT CAAGCCGGAT
3751 GGAAAACGCC ATGCCGTCAT CAGCGGTTCC GTCCTTTACA ACCAAGCCGA
3801 GAAAGGCAGT TACTCCCTCG GTATCTTTGG CGGAAAAGCC CAGGAAGTTG
3851 CCGGCAGCGC GGAAGTGAAA ACCGTAAACG GCATACGCCA TATCGGCCTT
3901 GCCGCCAAGC AACTCGAGCA CCACCACCAC CACCACTGA

1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD
51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV
101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE
151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD
201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD
251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF
301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP
351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG
401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG
451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT
501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT
551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK
601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE
651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE
701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA
751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG
801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF
851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL
901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG
951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR
1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLE
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1051 GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR KNEKLKLAAQ
1101 GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ IEVDGQLITL ESGEFQVYKQ
1151 SHSALTAFQT EQIQDSEHSG KMVAKRQFRI GDIAGEHTSF DKLPEGGRAT
1201 YRGTAFGSDD AGGKLTYTID FAAKQGNGKI EHLKSPELNV DLAAADIKPD
1251 GKRHAVISGS VLYNQAEKGS YSLGIFGGKA QEVAGSAEVK TVNGIRHIGL
1301 AAKQLEHHHH HH*
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AG983-961 



1 ATGACTTCTG CGCCCGACTT CAATGCAGGC GGTACCGGTA TCGGCAGCAA
51 CAGCAGAGCA ACAACAGCGA AATCAGCAGC AGTATCTTAC GCCGGTATCA
101 AGAACGAAAT GTGCAAAGAC AGAAGCATGC TCTGTGCCGG TCGGGATGAC
151 GTTGCGGTTA CAGACAGGGA TGCCAAAATC AATGCCCCCC CCCCGAATCT
201 GCATACCGGA GACTTTCCAA ACCCAAATGA CGCATACAAG AATTTGATCA
251 ACCTCAAACC TGCAATTGAA GCAGGCTATA CAGGACGCGG GGTAGAGGTA
301 GGTATCGTCG ACACAGGCGA ATCCGTCGGC AGCATATCCT TTCCCGAACT
351 GTATGGCAGA AAAGAACACG GCTATAACGA AAATTACAAA AACTATACGG
401 CGTATATGCG GAAGGAAGCG CCTGAAGACG GAGGCGGTAA AGACATTGAA
451 GCTTCTTTCG ACGATGAGGC CGTTATAGAG ACTGAAGCAA AGCCGACGGA
501 TATCCGCCAC GTAAAAGAAA TCGGACACAT CGATTTGGTC TCCCATATTA
551 TTGGCGGGCG TTCCGTGGAC GGCAGACCTG CAGGCGGTAT TGCGCCCGAT
601 GCGACGCTAC ACATAATGAA TACGAATGAT GAAACCAAGA ACGAAATGAT
651 GGTTGCAGCC ATCCGCAATG CATGGGTCAA GCTGGGCGAA CGTGGCGTGC
701 GCATCGTCAA TAACAGTTTT GGAACAACAT CGAGGGCAGG CACTGCCGAC
751 CTTTTCCAAA TAGCCAATTC GGAGGAGCAG TACCGCCAAG CGTTGCTCGA
801 CTATTCCGGC GGTGATAAAA CAGACGAGGG TATCCGCCTG ATGCAACAGA
851 GCGATTACGG CAACCTGTCC TACCACATCC GTAATAAAAA CATGCTTTTC
901 ATCTTTTCGA CAGGCAATGA CGCACAAGCT CAGCCCAACA CATATGCCCT
951 ATTGCCATTT TATGAAAAAG ACGCTCAAAA AGGCATTATC ACAGTCGCAG
1001 GCGTAGACCG CAGTGGAGAA AAGTTCAAAC GGGAAATGTA TGGAGAACCG
1051 GGTACAGAAC CGCTTGAGTA TGGCTCCAAC CATTGCGGAA TTACTGCCAT
1101 GTGGTGCCTG TCGGCACCCT ATGAAGCAAG CGTCCGTTTC ACCCGTACAA
1151 ACCCGATTCA AATTGCCGGA ACATCCTTTT CCGCACCCAT CGTAACCGGC
1201 ACGGCGGCTC TGCTGCTGCA GAAATACCCG TGGATGAGCA ACGACAACCT
1251 GCGTACCACG TTGCTGACGA CGGCTCAGGA CATCGGTGCA GTCGGCGTGG
1301 ACAGCAAGTT CGGCTGGGGA CTGCTGGATG CGGGTAAGGC CATGAACGGA
1351 CCCGCGTCCT TTCCGTTCGG CGACTTTACC GCCGATACGA AAGGTACATC
1401 CGATATTGCC TACTCCTTCC GTAACGACAT TTCAGGCACG GGCGGCCTGA
1451 TCAAAAAAGG CGGCAGCCAA CTGCAACTGC ACGGCAACAA CACCTATACG
1501 GGCAAAACCA TTATCGAAGG CGGTTCGCTG GTGTTGTACG GCAACAACAA
1551 ATCGGATATG CGCGTCGAAA CCAAAGGTGC GCTGATTTAT AACGGGGCGG
1601 CATCCGGCGG CAGCCTGAAC AGCGACGGCA TTGTCTATCT GGCAGATACC
1651 GACCAATCCG GCGCAAACGA AACCGTACAC ATCAAAGGCA GTCTGCAGCT
1701 GGACGGCAAA GGTACGCTGT ACACACGTTT GGGCAAACTG CTGAAAGTGG
1751 ACGGTACGGC GATTATCGGC GGCAAGCTGT ACATGTCGGC ACGCGGCAAG
1801 GGGGCAGGCT ATCTCAACAG TACCGGACGA CGTGTTCCCT TCCTGAGTGC
1851 CGCCAAAATC GGGCAGGATT ATTCTTTCTT CACAAACATC GAAACCGACG
1901 GCGGCCTGCT GGCTTCCCTC GACAGCGTCG AAAAAACAGC GGGCAGTGAA
1951 GGCGACACGC TGTCCTATTA TGTCCGTCGC GGCAATGCGG CACGGACTGC
2001 TTCGGCAGCG GCACATTCCG CGCCCGCCGG TCTGAAACAC GCCGTAGAAC
2051 AGGGCGGCAG CAATCTGGAA AACCTGATGG TCGAACTGGA TGCCTCCGAA
2101 TCATCCGCAA CACCCGAGAC GGTTGAAACT GCGGCAGCCG ACCGCACAGA
2151 TATGCCGGGC ATCCGCCCCT ACGGCGCAAC TTTCCGCGCA GCGGCAGCCG
2201 TACAGCATGC GAATGCCGCC GACGGTGTAC GCATCTTCAA CAGTCTCGCC
2251 GCTACCGTCT ATGCCGACAG TACCGCCGCC CATGCCGATA TGCAGGGACG
2301 CCGCCTGAAA GCCGTATCGG ACGGGTTGGA CCACAACGGC ACGGGTCTGC
2351 GCGTCATCGC GCAAACCCAA CAGGACGGTG GAACGTGGGA ACAGGGCGGT
2401 GTTGAAGGCA AAATGCGCGG CAGTACCCAA ACCGTCGGCA TTGCCGCGAA
2451 AACCGGCGAA AATACGACAG CAGCCGCCAC ACTGGGCATG GGACGCAGCA
2501 CATGGAGCGA AAACAGTGCA AATGCAAAAA CCGACAGCAT TAGTCTGTTT
2551 GCAGGCATAC GGCACGATGC GGGCGATATC GGCTATCTCA AAGGCCTGTT
2601 CTCCTACGGA CGCTACAAAA ACAGCATCAG CCGCAGCACC GGTGCGGACG
2651 AACATGCGGA AGGCAGCGTC AACGGCACGC TGATGCAGCT GGGCGCACTG
2701 GGCGGTGTCA ACGTTCCGTT TGCCGCAACG GGAGATTTGA CGGTCGAAGG
2751 CGGTCTGCGC TACGACCTGC TCAAACAGGA TGCATTCGCC GAAAAAGGCA
2801 GTGCTTTGGG CTGGAGCGGC AACAGCCTCA CTGAAGGCAC GCTGGTCGGA
2851 CTCGCGGGTC TGAAGCTGTC GCAACCCTTG AGCGATAAAG CCGTCCTGTT
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
2901 TGCAACGGCG GGCGTGGAAC GCGACCTGAA CGGACGCGAC TACACGGTAA
2951 CGGGCGGCTT TACCGGCGCG ACTGCAGCAA CCGGCAAGAC GGGGGCACGC
3001 AATATGCCGC ACACCCGTCT GGTTGCCGGC CTGGGCGCGG ATGTCGAATT
3051 CGGCAACGGC TGGAACGGCT TGGCACGTTA CAGCTACGCC GGTTCCAAAC
3101 AGTACGGCAA CCACAGCGGA CGAGTCGGCG TAGGCTACCG GTTCCTCGAG
3151 GGTGGCGGAG GCACTGGATC CGCCACAAAC GACGACGATG TTAAAAAAGC
3201 TGCCACTGTG GCCATTGCTG CTGCCTACAA CAATGGCCAA GAAATCAACG
3251 GTTTCAAAGC TGGAGAGACC ATCTACGACA TTGATGAAGA CGGCACAATT
3301 ACCAAAAAAG ACGCAACTGC AGCCGATGTT GAAGCCGACG ACTTTAAAGG
3351 TCTGGGTCTG AAAAAAGTCG TGACTAACCT GACCAAAACC GTCAATGAAA
3401 ACAAACAAAA CGTCGATGCC AAAGTAAAAG CTGCAGAATC TGAAATAGAA
3451 AAGTTAACAA CCAAGTTAGC AGACACTGAT GCCGCTTTAG CAGATACTGA
3501 TGCCGCTCTG GATGCAACCA CCAACGCCTT GAATAAATTG GGAGAAAATA
3551 TAACGACATT TGCTGAAGAG ACTAAGACAA ATATCGTAAA AATTGATGAA
3601 AAATTAGAAG CCGTGGCTGA TACCGTCGAC AAGCATGCCG AAGCATTCAA
3651 CGATATCGCC GATTCATTGG ATGAAACCAA CACTAAGGCA GACGAAGCCG
3701 TCAAAACCGC CAATGAAGCC AAACAGACGG CCGAAGAAAC CAAACAAAAC
3751 GTCGATGCCA AAGTAAAAGC TGCAGAAACT GCAGCAGGCA AAGCCGAAGC
3801 TGCCGCTGGC ACAGCTAATA CTGCAGCCGA CAAGGCCGAA GCTGTCGCTG
3851 CAAAAGTTAC CGACATCAAA GCTGATATCG CTACGAACAA AGATAATATT
3901 GCTAAAAAAG CAAACAGTGC CGACGTGTAC ACCAGAGAAG AGTCTGACAG
3951 CAAATTTGTC AGAATTGATG GTCTGAACGC TACTACCGAA AAATTGGACA
4001 CACGCTTGGC TTCTGCTGAA AAATCCATTG CCGATCACGA TACTCGCCTG
4051 AACGGTTTGG ATAAAACAGT GTCAGACCTG CGCAAAGAAA CCCGCCAAGG
4101 CCTTGCAGAA CAAGCCGCGC TCTCCGGTCT GTTCCAACCT TACAACGTGG
4151 GTCGGTTCAA TGTAACGGCT GCAGTCGGCG GCTACAAATC CGAATCGGCA
4201 GTCGCCATCG GTACCGGCTT CCGCTTTACC GAAAACTTTG CCGCCAAAGC
4251 AGGCGTGGCA GTCGGCACTT CGTCCGGTTC TTCCGCAGCC TACCATGTCG
4301 GCGTCAATTA CGAGTGGCTC GAGCACCACC ACCACCACCA CTGA

1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD
51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV
101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE
151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD
201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD
251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF
301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP
351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG
401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG
451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT
501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT
551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK
601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE
651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE
701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA
751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG
801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF
851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL
901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG
951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR
1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLE
1051 GGGGTGSATN DDDVKKAATV AIAAAYNNGQ EINGFKAGET IYDIDEDGTI
1101 TKKDATAADV EADDFKGLGL KKVVTNLTKT VNENKQNVDA KVKAAESEIE
1151 KLTTKLADTD AALADTDAAL DATTNALNKL GENITTFAEE TKTNIVKIDE
1201 KLEAVADTVD KHAEAFNDIA DSLDETNTKA DEAVKTANEA KQTAEETKQN
1251 VDAKVKAAET AAGKAEAAAG TANTAADKAE AVAAKVTDIK ADIATNKDNI
1301 AKKANSADVY TREESDSKFV RIDGLNATTE KLDTRLASAE KSIADHDTRL
1351 NGLDKTVSDL RKETRQGLAE QAALSGLFQP YNVGRFNVTA AVGGYKSESA
1401 VAIGTGFRFT ENFAAKAGVA VGTSSGSSAA YHVGVNYEWL EHHHHHH*
ΔG983-961c

1 ATGACTTCTG CGCCCGACTT CAATGCAGGC GGTACCGGTA TCGGCAGCAA
51 CAGCAGAGCA ACAACAGCGA AATCAGCAGC AGTATCTTAC GCCGGTATCA
101 AGAACGAAAT GTGCAAAGAC AGAAGCATGC TCTGTGCCGG TCGGGATGAC
151 GTTGCGGTTA CAGACAGGGA TGCCAAAATC AATGCCCCCC CCCCGAATCT
201 GCATACCGGA GACTTTCCAA ACCCAAATGA CGCATACAAG AATTTGATCA
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251 


ACCTCAAACC 


301 


GGTATCGTCG 


351 


GTATGGCAGA 


401 


CGTATATGCG 


451 


GCTTCTTTCG 


501 


TATCCGCCAC 


551 


TTGGCGGGCG 


601 


GCGACGCTAC 


651 


GGTTGCAGCC 


701 


GCATCGTCAA 


751 


CTTTTCCAAA 


801 


CTATTCCGGC 


851 


GCGATTACGG 


901 


ATCTTTTCGA 


951 


ATTGCCATTT 


1001 


GCGTAGACCG 


1051 


GGTACAGAAC 


1101 


GTGGTGCCTG 


1151 


ACCCGATTCA 


1201 


ACGGCGGCTC 


1251 


GCGTACCACG 


1301 


ACAGCAAGTT 


1351 


CCCGCGTCCT 


1401 


CGATATTGCC 


1451 


TCAAAAAAGG 


1501 


GGCAAAACCA 


1551 


ATCGGATATG 


1601 


CATCCGGCGG 


1651 


GACCAATCCG 


1701 


GGACGGCAAA 


1751 


ACGGTACGGC 


1801 


GGGGCAGGCT 


1851 


CGCCAAAATC 


1901 


GCGGCCTGCT 


1951 


GGCGACACGC 


2001 


TTCGGCAGCG 


2051 


AGGGCGGCAG 


2101 


TCATCCGCAA 


2151 


TATGCCGGGC 


2201 


TACAGCATGC 


2251 


GCTACCGTCT 


2301 


CCGCCTGAAA 


2351 


GCGTCATCGC 


2401 


GTTGAAGGCA 


2451 


AACCGGCGAA 


2501 


CATGGAGCGA 


2551 


GCAGGCATAC 


2601 


CTCCTACGGA 


2651 


AACATGCGGA 


2701 


GGCGGTGTCA 


2751 


CGGTCTGCGC 


2801 


GTGCTTTGGG 


2851 


CTCGCGGGTC 


2901 


TGCAACGGCG 


2951 


CGGGCGGCTT 


3001 


AATATGCCGC 


3051 


CGGCAACGGC 


3101 


AGTACGGCAA 


3151 


GGTGGC GGAG 


3201 


TGCCACTGTG 


3251 


GTTTCAAAGC 


3301 


ACCAAAAAAG 


3351 


TCTGGGTCTG 


3401 


ACAAACAAAA 


3451 


AAGTTAACAA 


3501 


TGCCGCTCTG 


3551 


TAACGAC AT T 



TGCAATTGAA 
ACACAGGCGA 
AAAGAAC AC G 
GAAGGAAGC G 
ACGATGAGGC 
GTAAAAGAAA 
TTCCGTGGAC 
ACATAATGAA 
ATCCGCAATG 
TAACAGTTTT 
TAGCCAATTC 
GGTGATAAAA 
CAACCTGTCC 
CAGGCAATGA 
TATGAAAAAG 
CAGTGGAGAA 
CGCTTGAGTA 
TCGGCACCCT 
AATTGCCGGA 
TGCTGCTGCA 
TTGCTGACGA 
CGGCTGGGGA 
TTCCGTTCGG 
TACTCCTTCC 
CGGCAGCCAA 
TTATCGAAGG 
CGCGTCGAAA 
CAGCCTGAAC 
GCGCAAACGA 
GGTACGCTGT 
GATTATCGGC 
ATCTCAACAG 
GGGCAGGATT 
GGCTTCCCTC 
TGTCCTATTA 
GCACATTCCG 
CAATCTGGAA 
CACCCGAGAC 
ATCCGCCCCT 
GAATGCCGCC 
ATGCCGACAG 
GCCGTATCGG 
GCAAACCCAA 
AAATGCGCGG 
AAT AC GAG AG 
AAACAGTGCA 
GGCACGATGC 
CGCTACAAAA 
AGGCAGCGTC 
ACGTTCCGTT 
TACGACCTGC 
CTGGAGCGGC 
TGAAGCTGTC 
GGCGTGGAAC 
TACCGGCGCG 
ACACCCGTCT 
TGGAACGGCT 
CC AC AGCGGA 
GCACTGGATC 
GCCATTGCTG 
TGGAGAGACC 
ACGCAACTGC 
AAAAAAGTCG 
CGTCGATGCC 
CCAAGTTAGC 
GATGCAACCA 
TGC TGAAGAG 



GCAGGCTATA 
ATCCGTCGGC 
GCTATAACGA 
CCTGAAGACG 
CGTTATAGAG 
TCGGACACAT 
GGCAGACCTG 
TACGAATGAT 
CATGGGTCAA 
GGAACAACAT 
GGAGGAGCAG 
C AG AC GAGGG 
TACCACATCC 
CGCACAAGCT 
ACGCTCAAAA 
AAGTTCAAAC 
TGGCTCCAAC 
ATGAAGCAAG 
ACATCCTTTT 
GAAATACCCG 
CGGCTCAGGA 
CTGCTGGATG 
CGACTTTACC 
GTAACGACAT 
CTGCAACTGC 
CGGTTCGCTG 
CCAAAGGTGC 
AGC GACGGC A 
AAC C GT AC AC 
AC AC AC GT TT 
GGCAAGCTGT 
T AC CGGACGA 
ATTCTTTCTT 
GACAGCGTCG 
TGTCCGTCGC 
CGCCCGCCGG 
AAC CTGATGG 
GGTTGAAACT 
ACGGCGCAAC 
GACGGTGTAC 
TACCGCCGCC 
ACGGGTTGGA 
CAGGACGGTG 
CAGTACCCAA 
CAGCCGCCAC 
AATGCAAAAA 
GGGCGATATC 
AC AGC AT C AG 
AACGGCACGC 
TGCCGCAACG 
TCAAACAGGA 
AACAGCCTCA 
GCAACCCTTG 
GCGACCTGAA 
AC TGC AGC AA 
GGTTGCCGGC 
TGGCACGTTA 
CGAGTCGGCG 
CGCCACAAAC 
CTGCCTACAA 
ATCTACGACA 
AGCCGATGTT 
TGACTAACCT 
AAAGTAAAAG 
AGACACTGAT 
CCAACGCCTT 
ACTAAGACAA 



CAGGACGCGG 
AGCATATCCT 
AAATTACAAA 
GAGGCGGTAA 
ACTGAAGCAA 
CGATTTGGTC 
CAGGCGGTAT 
GAAACCAAGA 
GCTGGGCGAA 
CGAGGGCAGG 
TACCGCCAAG 
TATCCGCCTG 
GTAATAAAAA 
CAGCCCAACA 
AGGCATTATC 
GGGAAATGTA 
CATTGCGGAA 
CGTCCGTTTC 
CCGCACCCAT 
TGGATGAGCA 
CATCGGTGCA 
CGGGTAAGGC 
GC C GAT AC GA 
TTCAGGCACG 
ACGGCAACAA 
GTGTTGTACG 
GCTGATTTAT 
TTGTCTATCT 
AT C AAAGGC A 
GGGCAAACTG 
ACATGTCGGC 
CGTGTTCCCT 
CACAAACATC 
AAAAAACAGC 
GGCAATGCGG 
TCTGAAACAC 
TCGAACTGGA 
GCGGCAGCCG 
TTTCCGCGCA 
GCATCTTCAA 
CATGCCGATA 
CCACAACGGC 
GAAC GTGGGA 
ACCGTCGGCA 
AC TGGGC ATG 
C CGAC AGC AT 
GGCTATCTCA 
CCGCAGCACC 
TGATGCAGCT 
GGAGATTTGA 
TGCATTCGCC 
CTGAAGGCAC 
AGC GAT AAAG 
CGGACGCGAC 
CCGGCAAGAC 
CTGGGCGCGG 
CAGCTACGCC 
TAGGCTACCG 
GACGACGATG 
CAATGGCCAA 
TTGATGAAGA 
GAAGCCGACG 
GACCAAAACC 
CTGCAGAATC 
GCCGCTTTAG 
GAATAAATTG 
ATATCGTAAA 



GGTAGAGGTA 
TTCCCGAACT 
AACTATACGG 
AGACATTGAA 
AGCCGACGGA 
T C CC AT ATT A 
TGCGCCCGAT 
AC GAAATGAT 
CGTGGCGTGC 
CACTGCCGAC 
CGTTGCTCGA 
ATGCAACAGA 
CATGCTTTTC 
CATATGCCCT 
ACAGTCGCAG 
TGGAGAACCG 
TTACTGCCAT 
ACCCGTACAA 
CGTAACCGGC 
AC G AC AAC C T 
GTCGGCGTGG 
CATGAACGGA 
AAGGTAC AT C 
GGCGGCCTGA 
CACCTATACG 
GCAACAACAA 
AACGGGGCGG 
GGCAGATACC 
GTCTGCAGCT 
CTGAAAGTGG 
ACGCGGCAAG 
TCCTGAGTGC 
GAAAC CGAC G 
GGGCAGTGAA 
C ACGGAC TGC 
GCCGTAGAAC 
TGCCTCCGAA 
ACCGCACAGA 
GCGGCAGCCG 
CAGTCTCGCC 
TGC AGGGAC G 
ACGGGTCTGC 
ACAGGGCGGT 
TTGCCGCGAA 
GGACGCAGCA 
TAGTCTGTTT 
AAGGCCTGTT 
GGTGCGGACG 
GGGCGCACTG 
CGGTCGAAGG 
GAAAAAGGCA 
GCTGGTCGGA 
CCGTCCTGTT 
TACACGGTAA 
GGGGGCACGC 
ATGTCGAATT 
GGTTCCAAAC 
GTTCCTCGAG 
TTAAAAAAGC 
GAAATC AAC G 
CGGCACAATT 
ACTTTAAAGG 
GT CAATGAAA 
TGAAATAGAA 
CAGATACTGA 
GGAGAAAATA 
AATTGATGAA 
3601 AAATTAGAAG CCGTGGCTGA TACCGTCGAC AAGCATGCCG AAGCATTCAA
3651 CGATATCGCC GATTCATTGG ATGAAACCAA CACTAAGGCA GACGAAGCCG
3701 TCAAAACCGC CAATGAAGCC AAACAGACGG CCGAAGAAAC CAAACAAAAC
3751 GTCGATGCCA AAGTAAAAGC TGCAGAAACT GCAGCAGGCA AAGCCGAAGC
3801 TGCCGCTGGC ACAGCTAATA CTGCAGCCGA CAAGGCCGAA GCTGTCGCTG
3851 CAAAAGTTAC CGACATCAAA GCTGATATCG CTACGAACAA AGATAATATT
3901 GCTAAAAAAG CAAACAGTGC CGACGTGTAC ACCAGAGAAG AGTCTGACAG
3951 CAAATTTGTC AGAATTGATG GTCTGAACGC TACTACCGAA AAATTGGACA
4001 CACGCTTGGC TTCTGCTGAA AAATCCATTG CCGATCACGA TACTCGCCTG
4051 AACGGTTTGG ATAAAACAGT GTCAGACCTG CGCAAAGAAA CCCGCCAAGG
4101 CCTTGCAGAA CAAGCCGCGC TCTCCGGTCT GTTCCAACCT TACAACGTGG
4151 GTCTCGAGCA CCACCACCAC CACCACTGA

1 MTSAPDFNAG GTGIGSNSRA TTAKSAAVSY AGIKNEMCKD RSMLCAGRDD
51 VAVTDRDAKI NAPPPNLHTG DFPNPNDAYK NLINLKPAIE AGYTGRGVEV
101 GIVDTGESVG SISFPELYGR KEHGYNENYK NYTAYMRKEA PEDGGGKDIE
151 ASFDDEAVIE TEAKPTDIRH VKEIGHIDLV SHIIGGRSVD GRPAGGIAPD
201 ATLHIMNTND ETKNEMMVAA IRNAWVKLGE RGVRIVNNSF GTTSRAGTAD
251 LFQIANSEEQ YRQALLDYSG GDKTDEGIRL MQQSDYGNLS YHIRNKNMLF
301 IFSTGNDAQA QPNTYALLPF YEKDAQKGII TVAGVDRSGE KFKREMYGEP
351 GTEPLEYGSN HCGITAMWCL SAPYEASVRF TRTNPIQIAG TSFSAPIVTG
401 TAALLLQKYP WMSNDNLRTT LLTTAQDIGA VGVDSKFGWG LLDAGKAMNG
451 PASFPFGDFT ADTKGTSDIA YSFRNDISGT GGLIKKGGSQ LQLHGNNTYT
501 GKTIIEGGSL VLYGNNKSDM RVETKGALIY NGAASGGSLN SDGIVYLADT
551 DQSGANETVH IKGSLQLDGK GTLYTRLGKL LKVDGTAIIG GKLYMSARGK
601 GAGYLNSTGR RVPFLSAAKI GQDYSFFTNI ETDGGLLASL DSVEKTAGSE
651 GDTLSYYVRR GNAARTASAA AHSAPAGLKH AVEQGGSNLE NLMVELDASE
701 SSATPETVET AAADRTDMPG IRPYGATFRA AAAVQHANAA DGVRIFNSLA
751 ATVYADSTAA HADMQGRRLK AVSDGLDHNG TGLRVIAQTQ QDGGTWEQGG
801 VEGKMRGSTQ TVGIAAKTGE NTTAAATLGM GRSTWSENSA NAKTDSISLF
851 AGIRHDAGDI GYLKGLFSYG RYKNSISRST GADEHAEGSV NGTLMQLGAL
901 GGVNVPFAAT GDLTVEGGLR YDLLKQDAFA EKGSALGWSG NSLTEGTLVG
951 LAGLKLSQPL SDKAVLFATA GVERDLNGRD YTVTGGFTGA TAATGKTGAR
1001 NMPHTRLVAG LGADVEFGNG WNGLARYSYA GSKQYGNHSG RVGVGYRFLE
1051 GGGGTGSATN DDDVKKAATV AIAAAYNNGQ EINGFKAGET IYDIDEDGTI
1101 TKKDATAADV EADDFKGLGL KKVVTNLTKT VNENKQNVDA KVKAAESEIE
1151 KLTTKLADTD AALADTDAAL DATTNALNKL GENITTFAEE TKTNIVKIDE
1201 KLEAVADTVD KHAEAFNDIA DSLDETNTKA DEAVKTANEA KQTAEETKQN
1251 VDAKVKAAET AAGKAEAAAG TANTAADKAE AVAAKVTDIK ADIATNKDNI
1301 AKKANSADVY TREESDSKFV RIDGLNATTE KLDTRLASAE KSIADHDTRL
1351 NGLDKTVSDL RKETRQGLAE QAALSGLFQP YNVGLEHHHH HH*

ΔG741 and hybrids

Bactericidal titres generated in response to ΔG741 (His-fusion) were measured against
various strains, including the homologous 2996 strain:





         2996    MC58      NGH38    F6124    BZ133
ΔG741    512     131072    >2048    16384    >2048



As can be seen, the ΔG741-induced bactericidal titre is particularly high against the
heterologous strain MC58.

ΔG741 was also fused directly in-frame upstream of proteins 961, 961c, 983 and ORF46.1:



ΔG741-961

1 ATGGTCGCCG CCGACATCGG TGCGGGGCTT GCCGATGCAC TAACCGCACC
51 GCTCGACCAT AAAGACAAAG GTTTGCAGTC TTTGACGCTG GATCAGTCCG
101 TCAGGAAAAA CGAGAAACTG AAGCTGGCGG CACAAGGTGC GGAAAAAACT
151 TATGGAAACG GTGACAGCCT CAATACGGGC AAATTGAAGA ACGACAAGGT
201 CAGCCGTTTC GACTTTATCC GCCAAATCGA AGTGGACGGG CAGCTCATTA
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251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 

1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 



CCTTGGAGAG 
ACCGCCTTTC 
GGTTGCGAAA 
CTTTTGACAA 
TTCGGTTCAG 
CGCCAAGCAG 
ATGTCGACCT 
GTCATCAGCG 
CCTCGGTATC 
TGAAAAC C GT 
GAGGGTGGCG 
AGCTGCCACT 
ACGGTTTCAA 
ATTACCAAAA 
AGGTCTGGGT 
AAAACAAACA 
GAAAAGTTAA 
TGATGCCGCT 
AT AT AAC G AC 
GAAAAATTAG 
CAACGATATC 
CCGTCAAAAC 
AACGTCGATG 
AGCTGCCGCT 
CTGCAAAAGT 
ATTGCTAAAA 
CAGCAAATTT 
ACACACGCTT 
CTGAACGGTT 
AGGCCTTGCA 
TGGGTCGGTT 
GCAGTCGCCA 
AGCAGGCGTG 
TCGGCGTCAA 

MVAADIGAGL 
YGNGDSLNTG 
TAFQTEQIQD 
FGSDDAGGKL 
VISGSVLYNQ 
EGGGGTGSAT 
ITKKDATAAD 
EKLTTKLADT 
EKLEAVADTV 
NVDAKVKAAE 
IAKKANSADV 
LNGLDKTVSD 
AVAIGTGFRF 



TGGAGAGTTC 
AGACCGAGCA 
CGCCAGTTCA 
GCTTCCCGAA 
ACGATGCCGG 
GGAAACGGCA 
GGCCGCCGCC 
GTTCCGTCCT 
TTTGGCGGAA 
AAACGGCATA 
GAGGCACTGG 
GTGGCCATTG 
AGCTGGAGAG 
AAGACGCAAC 
CTGAAAAAAG 
AAACGTCGAT 
CAACCAAGTT 
CTGGATGCAA 
ATTTGCTGAA 
AAGCCGTGGC 
GCCGATTCAT 
CGCCAATGAA 
CCAAAGTAAA 
GGCACAGCTA 
TACCGACATC 
AAGCAAACAG 
GTCAGAATTG 
GGCTTCTGCT 
TGGATAAAAC 
GAACAAGCCG 
CAATGTAACG 
TCGGTACCGG 
GCAGTCGGCA 
TTACGAGTGG 

ADALTAPLDH 
KLKNDKVSRF 
SEHSGKMVAK 
TYTIDFAAKQ 
AEKGSYSLGI 
NDDDVKKAAT 
VEADDFKGLG 
DAALADTDAA 
DKHAE AFND I 
TAAGKAEAAA 
YTREESDSKF 
LRKETRQGLA 
TENFAAKAGV 



ΔG741-961c



1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 



ATGGTCGCCG 
GCTCGACCAT 
TCAGGAAAAA 
TATGGAAACG 
CAGCCGTTTC 
CCTTGGAGAG 
ACCGCCTTTC 
GGTTGCGAAA 
CTTTTGACAA 
TTCGGTTCAG 
CGCCAAGCAG 
ATGTCGACCT 
GTCATCAGCG 
CCTCGGTATC 
TGAAAAC C GT 
GAGGGTGGCG 



CCGACATCGG 
AAAGACAAAG 
CGAGAAACTG 
GTGACAGCCT 
GACTTTATCC 
TGGAGAGTTC 
AGACCGAGCA 
CGCCAGTTCA 
GCTTCCCGAA 
ACGATGCCGG 
GGAAACGGCA 
GGCCGCCGCC 
GTTCCGTCCT 
TTTGGCGGAA 
AAACGGCATA 
GAGGCACTGG 



CAAGTATACA 
AATACAAGAT 
GAATCGGCGA 
GGCGGCAGGG 
CGGAAAACTG 
AAAT CGAAC A 
GATATCAAGC 
TTACAACCAA 
AAGCCCAGGA 
CGCCATATCG 
ATCCGCCACA 
CTGCTGCCTA 
AC CAT C T AC G 
TGCAGCCGAT 
TCGTGACTAA 
GCCAAAGTAA 
AGC AG AC AC T 
CCACCAACGC 
GAGACTAAGA 
TGATACCGTC 
TGGATGAAAC 
GCCAAACAGA 
AGCTGCAGAA 
ATACTGCAGC 
AAAGCTGATA 
TGCCGACGTG 
ATGGTCTGAA 
GAAAAATC C A 
AGTGTCAGAC 
CGCTCTCCGG 
GCTGCAGTCG 
CTTCCGCTTT 
CTTCGTCCGG 
CTCGAGCACC 

KDKGLQSLTL 
DFIRQIEVDG 
RQFRIGDIAG 
GNGKI EHIiK S 
FGGKAQEVAG 
VAIAAAYNNG 
LKKWTNLTK 
LDATTNALNK 
ADSLDETNTK 
GTANTAADKA 
VR I DGLNATT 
EQAALSGLFQ 
AVGTSSGSSA 



TGCGGGGCTT 
GTTTGCAGTC 
AAG CTGGCGG 
CAATACGGGC 
GCCAAATCGA 
CAAGTATACA 
AATACAAGAT 
GAATCGGCGA 
GGCGGCAGGG 
CGGAAAACTG 
AAATCGAACA 
GATATCAAGC 
TTACAACCAA 
AAGCCCAGGA 
CGCCATATCG 
ATCCGCCACA 



AACAAAGCCA 
TCGGAGCATT 
CATAGCGGGC 
CGACATATCG 
ACCTACACCA 
TTTGAAATCG 
CGGATGGAAA 
GC C GAGAAAG 
AGTTGCCGGC 
GCCTTGCCGC 
AACG AC GACG 
CAACAATGGC 
AC AT TGATGA 
GTTGAAGC CG 
C CTGAC C AAA 
AAGC TGCAGA 
GATGCCGCTT 
CTTGAATAAA 
CAAATATCGT 
GACAAGCATG 
C AAC AC T AAG 
CGGCCGAAGA 
ACTGCAGCAG 
CGACAAGGCC 
TCGCTACGAA 
TACACCAGAG 
CGCTACTACC 
TTGCCGATCA 
CTGCGCAAAG 
TCTGTTCCAA 
GCGGCTACAA 
ACCGAAAACT 
TTCTTCCGCA 
ACCACCACCA 

DQSVRKNEKL 
QLITLESGEF 
EHTSFDKkPE 
PELNVDLAAA 
SAEVKTVNGI 
QEINGFKAGE 
TWENKQNVD 
LGENITTFAE 
ADEAVKTANE 
EAVAAKVTDI 
EKLDTRLASA 
PYNVGRFNVT 
AYHVGVNYEW 



TTCCGCCTTA 
CCGGGAAGAT 
GAACATACAT 
CGGGACGGCG 
TAGATTTCGC 
CCAGAACTCA 
ACGCCATGCC 
GCAGTTACTC 
AGCGCGGAAG 
C AAGC AAC TC 
AT GTT AAAAA 
C AAGAAAT C A 
AGACGGCACA 
AC G AC TT T AA 
ACCGTCAATG 
ATCTGAAATA 
TAGCAGATAC 
TTGGGAGAAA 
AAAAATT GAT 
CCGAAGCATT 
GCAGACGAAG 
AACCAAACAA 
GCAAAGCCGA 
GAAGCTGTCG 
CAAAGATAAT 
AAGAGTC TG A 
GAAAAATTGG 
C GAT ACT C GC 
AAACCCGCCA 
C C TT AC AAC G 
ATCCGAATCG 
TTGCCGCCAA 
GCCTACCATG 
CCACTGA 

KLAAQGAEKT 
QVYKQSHSAL 
GGRATYRGTA 
DIKPDGKRHA 
RHIGLAAKQL 
TIYDIDEDGT 
AKVKAAESEI 
ETKTNIVKID 
AKQTAEETKQ 
KADIATNKDN 
EKSIADHDTR 
AAVGGYKSES 
LEHHHHHH* 



GC CGATGC AC 
TTTGACGC TG 
CACAAGGTGC 
AAATTGAAGA 
AGTGGACGGG 
AACAAAGCCA 
TCGGAGCATT 
CATAGCGGGC 
CGACATATCG 
ACCTACACCA 
TTTGAAATCG 
CGGATGGAAA 
GCCGAGAAAG 
AGTTGCCGGC 
GCCTTGCCGC 
AACGAC GACG 



TAACCGCACC 
GATCAGTCCG 
GGAAAAAACT 
AC GAC AAGGT 
CAGCTCATTA 
TTCCGCCTTA 
CCGGGAAGAT 
GAACATACAT 
CGGGACGGCG 
TAGATTTCGC 
CCAGAACTCA 
ACGCCATGCC 
GCAGTTACTC 
AGCGCGGAAG 
CAAGCAACTC 
ATGT TAAAAA 
801 AGCTGCCACT GTGGCCATTG CTGCTGCCTA CAACAATGGC CAAGAAATCA
851 ACGGTTTCAA AGCTGGAGAG ACCATCTACG ACATTGATGA AGACGGCACA
901 ATTACCAAAA AAGACGCAAC TGCAGCCGAT GTTGAAGCCG ACGACTTTAA
951 AGGTCTGGGT CTGAAAAAAG TCGTGACTAA CCTGACCAAA ACCGTCAATG
1001 AAAACAAACA AAACGTCGAT GCCAAAGTAA AAGCTGCAGA ATCTGAAATA
1051 GAAAAGTTAA CAACCAAGTT AGCAGACACT GATGCCGCTT TAGCAGATAC
1101 TGATGCCGCT CTGGATGCAA CCACCAACGC CTTGAATAAA TTGGGAGAAA
1151 ATATAACGAC ATTTGCTGAA GAGACTAAGA CAAATATCGT AAAAATTGAT
1201 GAAAAATTAG AAGCCGTGGC TGATACCGTC GACAAGCATG CCGAAGCATT
1251 CAACGATATC GCCGATTCAT TGGATGAAAC CAACACTAAG GCAGACGAAG
1301 CCGTCAAAAC CGCCAATGAA GCCAAACAGA CGGCCGAAGA AACCAAACAA
1351 AACGTCGATG CCAAAGTAAA AGCTGCAGAA ACTGCAGCAG GCAAAGCCGA
1401 AGCTGCCGCT GGCACAGCTA ATACTGCAGC CGACAAGGCC GAAGCTGTCG
1451 CTGCAAAAGT TACCGACATC AAAGCTGATA TCGCTACGAA CAAAGATAAT
1501 ATTGCTAAAA AAGCAAACAG TGCCGACGTG TACACCAGAG AAGAGTCTGA
1551 CAGCAAATTT GTCAGAATTG ATGGTCTGAA CGCTACTACC GAAAAATTGG
1601 ACACACGCTT GGCTTCTGCT GAAAAATCCA TTGCCGATCA CGATACTCGC
1651 CTGAACGGTT TGGATAAAAC AGTGTCAGAC CTGCGCAAAG AAACCCGCCA
1701 AGGCCTTGCA GAACAAGCCG CGCTCTCCGG TCTGTTCCAA CCTTACAACG
1751 TGGGTCTCGA GCACCACCAC CACCACCACT GA

1 MVAADIGAGL ADALTAPLDH KDKGLQSLTL DQSVRKNEKL KLAAQGAEKT
51 YGNGDSLNTG KLKNDKVSRF DFIRQIEVDG QLITLESGEF QVYKQSHSAL
101 TAFQTEQIQD SEHSGKMVAK RQFRIGDIAG EHTSFDKLPE GGRATYRGTA
151 FGSDDAGGKL TYTIDFAAKQ GNGKIEHLKS PELNVDLAAA DIKPDGKRHA
201 VISGSVLYNQ AEKGSYSLGI FGGKAQEVAG SAEVKTVNGI RHIGLAAKQL
251 EGGGGTGSAT NDDDVKKAAT VAIAAAYNNG QEINGFKAGE TIYDIDEDGT
301 ITKKDATAAD VEADDFKGLG LKKVVTNLTK TVNENKQNVD AKVKAAESEI
351 EKLTTKLADT DAALADTDAA LDATTNALNK LGENITTFAE ETKTNIVKID
401 EKLEAVADTV DKHAEAFNDI ADSLDETNTK ADEAVKTANE AKQTAEETKQ
451 NVDAKVKAAE TAAGKAEAAA GTANTAADKA EAVAAKVTDI KADIATNKDN
501 IAKKANSADV YTREESDSKF VRIDGLNATT EKLDTRLASA EKSIADHDTR
551 LNGLDKTVSD LRKETRQGLA EQAALSGLFQ PYNVGLEHHH HHH*
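
The C-terminal residues LEHHHHHH* of the hybrids above can be traced back to the 3' end of the coding sequence. The short Python sketch below translates the last 27 nucleotides of the ΔG741-961c sequence (positions 1756-1782). The helper name `translate` and the deliberately reduced codon table are illustrative assumptions and are not part of the original disclosure.

```python
# Reduced codon table: only the codons present in the 3' tail (an assumption
# made to keep the sketch short; a complete table has 64 entries).
CODONS = {"CTC": "L", "GAG": "E", "CAC": "H", "TGA": "*"}

def translate(dna: str) -> str:
    """Translate an in-frame DNA string codon by codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))

# Last 27 nt of the ΔG741-961c coding sequence (positions 1756-1782).
tail = "CTCGAGCACCACCACCACCACCACTGA"

print(translate(tail))  # LEHHHHHH*
```

The leading CTCGAG is an XhoI recognition sequence, which encodes the Leu-Glu pair that precedes the six histidine codons and the stop codon.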



ΔG741-983

1 ATGGTCGCCG CCGACATCGG TGCGGGGCTT GCCGATGCAC TAACCGCACC
51 GCTCGACCAT AAAGACAAAG GTTTGCAGTC TTTGACGCTG GATCAGTCCG
101 TCAGGAAAAA CGAGAAACTG AAGCTGGCGG CACAAGGTGC GGAAAAAACT
151 TATGGAAACG GTGACAGCCT CAATACGGGC AAATTGAAGA ACGACAAGGT
201 CAGCCGTTTC GACTTTATCC GCCAAATCGA AGTGGACGGG CAGCTCATTA
251 CCTTGGAGAG TGGAGAGTTC CAAGTATACA AACAAAGCCA TTCCGCCTTA
301 ACCGCCTTTC AGACCGAGCA AATACAAGAT TCGGAGCATT CCGGGAAGAT
351 GGTTGCGAAA CGCCAGTTCA GAATCGGCGA CATAGCGGGC GAACATACAT
401 CTTTTGACAA GCTTCCCGAA GGCGGCAGGG CGACATATCG CGGGACGGCG
451 TTCGGTTCAG ACGATGCCGG CGGAAAACTG ACCTACACCA TAGATTTCGC
501 CGCCAAGCAG GGAAACGGCA AAATCGAACA TTTGAAATCG CCAGAACTCA
551 ATGTCGACCT GGCCGCCGCC GATATCAAGC CGGATGGAAA ACGCCATGCC
601 GTCATCAGCG GTTCCGTCCT TTACAACCAA GCCGAGAAAG GCAGTTACTC
651 CCTCGGTATC TTTGGCGGAA AAGCCCAGGA AGTTGCCGGC AGCGCGGAAG
701 TGAAAACCGT AAACGGCATA CGCCATATCG GCCTTGCCGC CAAGCAACTC
751 GAGGGATCCG GCGGAGGCGG CACTTCTGCG CCCGACTTCA ATGCAGGCGG
801 TACCGGTATC GGCAGCAACA GCAGAGCAAC AACAGCGAAA TCAGCAGCAG
851 TATCTTACGC CGGTATCAAG AACGAAATGT GCAAAGACAG AAGCATGCTC
901 TGTGCCGGTC GGGATGACGT TGCGGTTACA GACAGGGATG CCAAAATCAA
951 TGCCCCCCCC CCGAATCTGC ATACCGGAGA CTTTCCAAAC CCAAATGACG
1001 CATACAAGAA TTTGATCAAC CTCAAACCTG CAATTGAAGC AGGCTATACA
1051 GGACGCGGGG TAGAGGTAGG TATCGTCGAC ACAGGCGAAT CCGTCGGCAG
1101 CATATCCTTT CCCGAACTGT ATGGCAGAAA AGAACACGGC TATAACGAAA
1151 ATTACAAAAA CTATACGGCG TATATGCGGA AGGAAGCGCC TGAAGACGGA
1201 GGCGGTAAAG ACATTGAAGC TTCTTTCGAC GATGAGGCCG TTATAGAGAC
1251 TGAAGCAAAG CCGACGGATA TCCGCCACGT AAAAGAAATC GGACACATCG
1301 ATTTGGTCTC CCATATTATT GGCGGGCGTT CCGTGGACGG CAGACCTGCA
1351 GGCGGTATTG CGCCCGATGC GACGCTACAC ATAATGAATA CGAATGATGA
1401 AACCAAGAAC GAAATGATGG TTGCAGCCAT CCGCAATGCA TGGGTCAAGC
1451 TGGGCGAACG TGGCGTGCGC ATCGTCAATA ACAGTTTTGG AACAACATCG
1501 AGGGCAGGCA CTGCCGACCT TTTCCAAATA GCCAATTCGG AGGAGCAGTA
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1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 
2451 
2501 
2551 
2601 
2651 
2701 
2751 
2801 
2851 
2901 
2951 
3001 
3051 
3101 
3151 
3201 
3251 
3301 
3351 
3401 
3451 
3501 
3551 
3601 
3651 
3701 
3751 
3801 
3851 
3901 

1 
51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 



CCGCCAAGCG 
TCCGCCTGAT 
AATAAAAACA 
GCCCAACACA 
GCATTATCAC 
GAAATGTATG 
TTGCGGAATT 
TCCGTTTCAC 
GCACCCATCG 
GATGAGCAAC 
TCGGTGCAGT 
GGTAAGGC C A 
CGATACGAAA 
CAGGCACGGG 
GGCAACAACA 
GTTGTACGGC 
TGATT T AT AA 
GTCTATCTGG 
CAAAGGCAGT 
GCAAACTGCT 
ATGTCGGCAC 
TGTTCCCTTC 
CAAACATCGA 
AAAACAGCGG 
CAATGCGGCA 
TGAAACACGC 
GAACTGGATG 
GGCAGCCGAC 
TCCGCGCAGC 
ATCTTCAACA 
TGCCGATATG 
ACAACGGCAC 
ACGTGGGAAC 
CGTCGGCATT 
TGGGCATGGG 
GACAGCATTA 
CTATCTCAAA 
GCAGCACCGG 
ATGCAGCTGG 
AGATTTGACG 
CATTCGCCGA 
GAAGGCACGC 
CGATAAAGCC 
GACGCGACTA 
GGCAAGACGG 
GGGCGCGGAT 
GCTACGCCGG 
GGCTACCGGT 

MVAAD I G AGL 
YGNGDSLNTG 
TAFQTEQIQD 
FGSDDAGGKL 
VISGSVLYNQ 
EGSGGGGTSA 
CAGRDDVAVT 
GRGVEVG1VD 
GGKDIEASFD 
GGIAPDATLH 
RAGTADLFQI 
NKNMLFIFST 
EMYGEPGTEP 
APIVTGTAAL 
GKAMNGPASF 
GNNTYTGKTI 
VYLADTDQSG 
MSARGKGAGY 



TTGCTCGACT 
GCAACAGAGC 
TGCTTTTCAT 
TATGCC CTAT 
AGTCGCAGGC 
GAGAAC CGGG 
ACTGCCATGT 
CCGTACAAAC 
TAACCGGCAC 
GACAACCTGC 
CGGCGTGGAC 
TGAACGGACC 
GGTACATCCG 
CGGCCTGATC 
CCTATACGGG 
AACAACAAAT 
CGGGGCGGCA 
CAGATACCGA 
CTGCAGCTGG 
GAAAGTGGAC 
GCGGCAAGGG 
CTGAGTGCCG 
AAC CGACGGC 
GCAGTGAAGG 
CGGACTGCTT 
CGTAGAACAG 
CCTCCGAATC 
CGCACAGATA 
GGCAGCCGTA 
GTCTCGCCGC 
CAGGGACGCC 
GGGTCTGCGC 
AGGGCGGTGT 
GCCGCGAAAA 
ACGCAGCACA 
GTCTGTTTGC 
GGCCTGTTCT 
TGCGGACGAA 
GCGCACTGGG 
GTCGAAGGCG 
AAAAGGCAGT 
TGGTCGGACT 
GTCCTGTTTG 
CACGGTAACG 
GGGCACGCAA 
GTCGAATTCG 
TTCCAAACAG 
TCCTCGAGCA 

ADALTAPLDH 
KLKNDKVSRF 
SEHSGKWAK 
TYTIDFAAKQ 
AEKGSYSLGI 
PDFNAGGTGI 
DRDAKINAPP 
TGESVGSISF 
DE AVI ETE AK 
IMNTNDETKN 
ANSEEQYRQA 
GNDAQAQPNT 
LEYGSNHCGI 
LLQKYPWMSN 
PFGDFTADTK 
IEGGSLVLYG 
ANETVHIKGS 
LNSTGRRVPF 



ATTCCGGCGG 
GATTACGGCA 
CTTTTCGACA 
TGCCATTTTA 
GTAGACCGCA 
TACAGAACCG 
GGTGCCTGTC 
CCGATTCAAA 
GGCGGCTCTG 
GTACCACGTT 
AGCAAGTTCG 
CGCGTCCTTT 
AT AT TGCCT A 
AAAAAAGGCG 
CAAAACCATT 
CGGATATGCG 
TCCGGCGGCA 
CCAATCCGGC 
ACGGCAAAGG 
GGTACGGCGA 
GGCAGGCTAT 
CCAAAATCGG 
GGCCTGCTGG 
CGACACGCTG 
CGGCAGCGGC 
GGCGGCAGCA 
ATCCGCAACA 
TGCCGGGCAT 
CAGCATGCGA 
T AC C GT CTAT 
GCCTGAAAGC 
GTCATCGCGC 
TGAAGGCAAA 
CCGGCGAAAA 
TGGAGCGAAA 
AGGCATACGG 
CCTACGGACG 
CATGCGGAAG 
CGGTGTCAAC 
GTCTGCGCTA 
GCTTTGGGCT 
CGCGGGTCTG 
CAACGGCGGG 
GGCGGCTTTA 
TATGCCGCAC 
GCAACGGCTG 
TACGGCAACC 
CCACCACCAC 

KDKGLQSLTL 
DFIRQIEVDG 
RQFRIGDIAG 
GNGKIEHLKS 
FGGKAQEVAG 
GSNSRATTAK 
PNLHTGDFPN 
PELYGRKEHG 
PTDIRHVKEI 
EMMVAAIRNA 
LLDYSGGDKT 
YALLPFYEKD 
TAMWCLSAPY 
DNLRTTLLTT 
GTSDIAYSFR 
NNKSDMRVET 
LQLDGKGTBY 
LSAAKIGQDY 



TGATAAAACA 
ACCTGTCCTA 
GGCAATGACG 
TGAAAAAGAC 
GTGGAGAAAA 
CTTGAGTATG 
GGCACCCTAT 
TTGCCGGAAC 
CTGCTGCAGA 
GCTGACGACG 
GCTGGGGACT 
CCGTTCGGCG 
CTCCTTCCGT 
GCAGCCAACT 
ATCGAAGGCG 
CGTCGAAACC 
GCCTGAACAG 
GCAAACGAAA 
TACGCTGTAC 
TTATCGGCGG 
CTCAACAGTA 
GCAGGATTAT 
CTTCCCTCGA 
TCCTATTATG 
ACATTCCGCG 
ATCTGGAAAA 
CCCGAGACGG 
CCGCCCCTAC 
ATGCCGCCGA 
GCCGACAGTA 
CGTATCGGAC 
AAACCCAACA 
ATGCGCGGCA 
TACGACAGCA 
ACAGTGCAAA 
CACGATGCGG 
CTACAAAAAC 
GCAGCGTCAA 
GTTCCGTTTG 
CGACCTGCTC 
GGAGCGGCAA 
AAGCTGTCGC 
CGTGGAACGC 
CCGGCGCGAC 
ACCCGTCTGG 
GAACGGCTTG 
ACAGCGGACG 
CACCACTGA 

DQSVRKNEKL 
QLITLESGEF 
EHTSFDKLPE 
PELNVDLAAA 
SAEVKTVNGI 
SAAVSYAGIK 
PNDAYKNLIN 
YNENYKNYTA 
GHIDLVSHII 
WVKLGERGVR 
DEGIRLMQQS 
AQKGIITVAG 
EASVRFTRTN 
AQDIGAVGVD 
NDISGTGGLI 
KGALIYNGAA 
TRLGKLLKVD 
SFFTNIETDG 



GACGAGGGTA 
CCACATCCGT 
CACAAGCTCA 
GCTCAAAAAG 
GTTCAAACGG 
GCTCCAACCA 
GAAGCAAGCG 
ATCCTTTTCC 
AATACCCGTG 
GCTCAGGACA 
GCTGGATGCG 
ACTTTACCGC 
AACGACATTT 
GC AAC TGC AC 
GTTCGCTGGT 
AAAGGTGCGC 
CGACGGC ATT 
CCGTACACAT 
ACACGTTTGG 
C AAGC TGTAC 
CCGGACGACG 
TCTTTCTTCA 
CAGCGTCGAA 
TCCGTCGCGG 
CCCGCCGGTC 
CCTGATGGTC 
TTGAAACTGC 
GGCGCAACTT 
CGGTGTACGC 
CCGCCGCCCA 
GGGTTGGACC 
GGACGGTGGA 
GTACCCAAAC 
GCCGCCACAC 
TGCAAAAACC 
GCGATATCGG 
AGCATCAGCC 
CGGCACGCTG 
CCGCAACGGG 
AAACAGGATG 
CAGCCTCACT 
AACCCTTGAG 
GACCTGAACG 
TGCAGCAACC 
TTGCCGGCCT 
GCACGTTACA 
AGTCGGCGTA 



KLAAQGAEKT 
QVYKQSHSAL 
GGRATYRGTA 
DIKPDGKRHA 
RHIGLAAKQL 
NEMCKDRSML 
LKPAIEAGYT 
YMRKEAPEDG 
GGRSVDGRPA 
IVXsINSFGTTS 
DYGNLSYHIR 
VDRSGEKFKR 
PIQ1AGTSFS 
SKFGWGLLDA 
KKGGSQLQLH 
SGGSLNSDGI 
GTAIIGGKLY 
GLLASLDSVE 
901 KTAGSEGDTL SYYVRRGNAA RTASAAAHSA PAGLKHAVEQ GGSNLENLMV
951 ELDASESSAT PETVETAAAD RTDMPGIRPY GATFRAAAAV QHANAADGVR
1001 IFNSLAATVY ADSTAAHADM QGRRLKAVSD GLDHNGTGLR VIAQTQQDGG
1051 TWEQGGVEGK MRGSTQTVGI AAKTGENTTA AATLGMGRST WSENSANAKT
1101 DSISLFAGIR HDAGDIGYLK GLFSYGRYKN SISRSTGADE HAEGSVNGTL
1151 MQLGALGGVN VPFAATGDLT VEGGLRYDLL KQDAFAEKGS ALGWSGNSLT
1201 EGTLVGLAGL KLSQPLSDKA VLFATAGVER DLNGRDYTVT GGFTGATAAT
1251 GKTGARNMPH TRLVAGLGAD VEFGNGWNGL ARYSYAGSKQ YGNHSGRVGV
1301 GYRFLEHHHH HH*



ΔG741-ORF46.1

1 ATGGTCGCCG CCGACATCGG TGCGGGGCTT GCCGATGCAC TAACCGCACC
51 GCTCGACCAT AAAGACAAAG GTTTGCAGTC TTTGACGCTG GATCAGTCCG
101 TCAGGAAAAA CGAGAAACTG AAGCTGGCGG CACAAGGTGC GGAAAAAACT
151 TATGGAAACG GTGACAGCCT CAATACGGGC AAATTGAAGA ACGACAAGGT
201 CAGCCGTTTC GACTTTATCC GCCAAATCGA AGTGGACGGG CAGCTCATTA
251 CCTTGGAGAG TGGAGAGTTC CAAGTATACA AACAAAGCCA TTCCGCCTTA
301 ACCGCCTTTC AGACCGAGCA AATACAAGAT TCGGAGCATT CCGGGAAGAT
351 GGTTGCGAAA CGCCAGTTCA GAATCGGCGA CATAGCGGGC GAACATACAT
401 CTTTTGACAA GCTTCCCGAA GGCGGCAGGG CGACATATCG CGGGACGGCG
451 TTCGGTTCAG ACGATGCCGG CGGAAAACTG ACCTACACCA TAGATTTCGC
501 CGCCAAGCAG GGAAACGGCA AAATCGAACA TTTGAAATCG CCAGAACTCA
551 ATGTCGACCT GGCCGCCGCC GATATCAAGC CGGATGGAAA ACGCCATGCC
601 GTCATCAGCG GTTCCGTCCT TTACAACCAA GCCGAGAAAG GCAGTTACTC
651 CCTCGGTATC TTTGGCGGAA AAGCCCAGGA AGTTGCCGGC AGCGCGGAAG
701 TGAAAACCGT AAACGGCATA CGCCATATCG GCCTTGCCGC CAAGCAACTC
751 GACGGTGGCG GAGGCACTGG ATCCTCAGAT TTGGCAAACG ATTCTTTTAT
801 CCGGCAGGTT CTCGACCGTC AGCATTTCGA ACCCGACGGG AAATACCACC
851 TATTCGGCAG CAGGGGGGAA CTTGCCGAGC GCAGCGGCCA TATCGGATTG
901 GGAAAAATAC AAAGCCATCA GTTGGGCAAC CTGATGATTC AACAGGCGGC
951 CATTAAAGGA AATATCGGCT ACATTGTCCG CTTTTCCGAT CACGGGCACG
1001 AAGTCCATTC CCCCTTCGAC AACCATGCCT CACATTCCGA TTCTGATGAA
1051 GCCGGTAGTC CCGTTGACGG ATTTAGCCTT TACCGCATCC ATTGGGACGG
1101 ATACGAACAC CATCCCGCCG ACGGCTATGA CGGGCCACAG GGCGGCGGCT
1151 ATCCCGCTCC CAAAGGCGCG AGGGATATAT ACAGCTACGA CATAAAAGGC
1201 GTTGCCCAAA ATATCCGCCT CAACCTGACC GACAACCGCA GCACCGGACA
1251 ACGGCTTGCC GACCGTTTCC ACAATGCCGG TAGTATGCTG ACGCAAGGAG
1301 TAGGCGACGG ATTCAAACGC GCCACCCGAT ACAGCCCCGA GCTGGACAGA
1351 TCGGGCAATG CCGCCGAAGC CTTCAACGGC ACTGCAGATA TCGTTAAAAA
1401 CATCATCGGC GCGGCAGGAG AAATTGTCGG CGCAGGCGAT GCCGTGCAGG
1451 GCATAAGCGA AGGCTCAAAC ATTGCTGTCA TGCACGGCTT GGGTCTGCTT
1501 TCCACCGAAA ACAAGATGGC GCGCATCAAC GATTTGGCAG ATATGGCGCA
1551 ACTCAAAGAC TATGCCGCAG CAGCCATCCG CGATTGGGCA GTCCAAAACC
1601 CCAATGCCGC ACAAGGCATA GAAGCCGTCA GCAATATCTT TATGGCAGCC
1651 ATCCCCATCA AAGGGATTGG AGCTGTTCGG GGAAAATACG GCTTGGGCGG
1701 CATCACGGCA CATCCTATCA AGCGGTCGCA GATGGGCGCG ATCGCATTGC
1751 CGAAAGGGAA ATCCGCCGTC AGCGACAATT TTGCCGATGC GGCATACGCC
1801 AAATACCCGT CCCCTTACCA TTCCCGAAAT ATCCGTTCAA ACTTGGAGCA
1851 GCGTTACGGC AAAGAAAACA TCACCTCCTC AACCGTGCCG CCGTCAAACG
1901 GCAAAAATGT CAAACTGGCA GACCAACGCC ACCCGAAGAC AGGCGTACCG
1951 TTTGACGGTA AAGGGTTTCC GAATTTTGAG AAGCACGTGA AATATGATAC
2001 GCTCGAGCAC CACCACCACC ACCACTGA



1 MVAADIGAGL ADALTAPLDH KDKGLQSLTL DQSVRKNEKL KLAAQGAEKT
51 YGNGDSLNTG KLKNDKVSRF DFIRQIEVDG QLITLESGEF QVYKQSHSAL
101 TAFQTEQIQD SEHSGKMVAK RQFRIGDIAG EHTSFDKLPE GGRATYRGTA
151 FGSDDAGGKL TYTIDFAAKQ GNGKIEHLKS PELNVDLAAA DIKPDGKRHA
201 VISGSVLYNQ AEKGSYSLGI FGGKAQEVAG SAEVKTVNGI RHIGLAAKQL
251 DGGGGTGSSD LANDSFIRQV LDRQHFEPDG KYHLFGSRGE LAERSGHIGL
301 GKIQSHQLGN LMIQQAAIKG NIGYIVRFSD HGHEVHSPFD NHASHSDSDE
351 AGSPVDGFSL YRIHWDGYEH HPADGYDGPQ GGGYPAPKGA RDIYSYDIKG
401 VAQNIRLNLT DNRSTGQRLA DRFHNAGSML TQGVGDGFKR ATRYSPELDR
451 SGNAAEAFNG TADIVKNIIG AAGEIVGAGD AVQGISEGSN IAVMHGLGLL
501 STENKMARIN DLADMAQLKD YAAAAIRDWA VQNPNAAQGI EAVSNIFMAA
551 IPIKGIGAVR GKYGLGGITA HPIKRSQMGA IALPKGKSAV SDNFADAAYA
601 KYPSPYHSRN IRSNLEQRYG KENITSSTVP PSNGKNVKLA DQRHPKTGVP
651 FDGKGFPNFE KHVKYDTLEH HHHHH*
Example 16 - C-terminal fusions ('hybrids') with 287/ΔG287

According to the invention, hybrids of two proteins A & B may be either NH2-A-B-COOH
or NH2-B-A-COOH. The effect of this difference was investigated using protein 287 either
C-terminal (in '287-His' form) or N-terminal (in ΔG287 form; sequences shown above) to
919, 953 and ORF46.1. A panel of strains was used, including homologous strain 2996. FCA
was used as adjuvant:





                 287 & 919              287 & 953              287 & ORF46.1
Strain           ΔG287-919   919-287    ΔG287-953   953-287    ΔG287-46.1   46.1-287
2996             128000      16000      65536       8192       16384        8192
BZ232            256         128        128         <4         <4           <4
1000             2048        <4         <4          <4         <4           <4
MC58             8192        1024       16384       1024       512          128
NGH38            32000       2048       >2048       4096       16384        4096
394/98           4096        32         256         128        128          16
MenA (F6124)     32000       2048       >2048       32         8192         1024
MenC (BZ133)     64000       >8192      >8192       <16        8192         2048


Better bactericidal titres are generally seen with 287 at the N-terminus (in the ΔG form).
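
To make the orientation effect concrete, the following Python sketch computes the fold difference between the two orientations for the homologous strain 2996, using only the titres listed in the table above. The function and variable names are illustrative assumptions. Rows containing censored values (such as <4 or >2048) are left out because a ratio cannot be computed from them without further assumptions.

```python
# Titres against the homologous strain 2996, copied from the table above:
# (287 at the N-terminus as ΔG287-X, 287 at the C-terminus as X-287).
titres_2996 = {
    "919": (128000, 16000),
    "953": (65536, 8192),
    "ORF46.1": (16384, 8192),
}

def fold_difference(n_terminal: int, c_terminal: int) -> float:
    """Ratio of the ΔG287-X titre to the X-287 titre."""
    return n_terminal / c_terminal

folds = {partner: fold_difference(*pair) for partner, pair in titres_2996.items()}
for partner, fold in folds.items():
    print(f"287 & {partner}: {fold:g}-fold")
```

For this strain the N-terminal orientation gives an 8-fold higher titre with 919 and 953, and a 2-fold higher titre with ORF46.1.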



When fused to protein 961 [NH2-ΔG287-961-COOH; sequence shown above], the resulting
protein is insoluble and must be denatured and renatured for purification. Following
renaturation, around 50% of the protein was found to remain insoluble. The soluble and
insoluble proteins were compared, and much better bactericidal titres were obtained with the
soluble protein (FCA as adjuvant):





             2996     BZ232    MC58    NGH38    F6124    BZ133
Soluble      65536    128      4096    >2048    >2048    4096
Insoluble    8192     <4       <4      16       n.d.     n.d.



Titres with the insoluble form were, however, improved by using alum adjuvant instead: 



             2996     BZ232    MC58    NGH38    F6124    BZ133
Insoluble    32768    128      4096    >2048    >2048    2048



Example 17 - N-terminal fusions ('hybrids') to 287

Expression of protein 287 as full-length with a C-terminal His-tag, or without its leader
peptide but with a C-terminal His-tag, gives fairly low expression levels. Better expression is
achieved using an N-terminal GST-fusion.



WO 01/64922 



PCT/IB01/00452 




As an alternative to using GST as an N-terminal fusion partner, 287 was placed at the
C-terminus of protein 919 ('919-287'), of protein 953 ('953-287'), and of protein ORF46.1
('ORF46.1-287'). In each case, the leader peptides were deleted, and the hybrids were direct
in-frame fusions.

To generate the 953-287 hybrid, the leader peptides of the two proteins were omitted by
designing the forward primer downstream from the leader of each sequence; the stop codon
sequence was omitted in the 953 reverse primer but included in the 287 reverse primer. For
the 953 gene, the 5' and the 3' primers used for amplification included an NdeI and a BamHI
restriction site respectively, whereas for the amplification of the 287 gene the 5' and the 3'
primers included a BamHI and an XhoI restriction site respectively. In this way a sequential
directional cloning of the two genes in pET21b+, using NdeI-BamHI (to clone the first gene)
and subsequently BamHI-XhoI (to clone the second gene), could be achieved.
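
A direct in-frame fusion through a restriction site requires that the site does not shift the reading frame. As a sketch only, and assuming (hypothetically) that the six-base BamHI site sits exactly between the last codon of the first gene and the first codon of the second, the junction contributes two codons. The function name, the reduced codon table and the placeholder flanking codons below are illustrative and are not the primers actually used.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# Reduced codon table covering only the codons used in this sketch.
CODONS = {"GGA": "G", "TCC": "S", "GCT": "A", "AAA": "K"}

def junction_peptide(upstream_codon: str, downstream_codon: str) -> str:
    """Translate upstream codon + BamHI site + downstream codon."""
    dna = upstream_codon + BAMHI_SITE + downstream_codon
    if len(dna) % 3 != 0:
        raise ValueError("junction would shift the reading frame")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))

# Placeholder flanking codons (hypothetical, chosen only for illustration).
print(junction_peptide("GCT", "AAA"))  # AGSK
```

Read in this frame, GGATCC encodes Gly-Ser. The same pair is visible at the GGATCC junction of the ΔG741-983 sequence shown earlier, where CTC GAG GGA TCC is translated as L-E-G-S.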

The 919-287 hybrid was obtained by cloning the sequence coding for the mature portion of
287 into the XhoI site at the 3'-end of the 919-His clone in pET21b+. The primers used for
amplification of the 287 gene were designed to introduce a SalI restriction site at the 5'-end
and an XhoI site at the 3'-end of the PCR fragment. Since the cohesive ends produced by the SalI
and XhoI restriction enzymes are compatible, the 287 PCR product digested with SalI-XhoI
could be inserted in the pET21b-919 clone cleaved with XhoI.
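
The compatibility of SalI and XhoI ends follows from their recognition sequences (SalI G^TCGAC, XhoI C^TCGAG): both leave the same four-base 5' overhang. The Python sketch below checks this. The dictionary layout and function name are illustrative assumptions.

```python
# Recognition sites and the top-strand index after which each enzyme cuts:
# SalI G^TCGAC, XhoI C^TCGAG.
ENZYMES = {
    "SalI": ("GTCGAC", 1),
    "XhoI": ("CTCGAG", 1),
}

def five_prime_overhang(name: str) -> str:
    """Return the single-stranded 5' overhang left by a palindromic cutter."""
    site, cut = ENZYMES[name]
    # For a palindromic site cut symmetrically, the overhang spans
    # top-strand positions cut .. len(site) - cut.
    return site[cut:len(site) - cut]

sal, xho = five_prime_overhang("SalI"), five_prime_overhang("XhoI")
print(sal, xho, sal == xho)  # TCGA TCGA True

# Junction formed when a SalI-cut insert end joins an XhoI-cut vector end:
# the vector contributes "C", the insert contributes "TCGAC".
hybrid_site = ENZYMES["XhoI"][0][:1] + ENZYMES["SalI"][0][1:]
print(hybrid_site)  # CTCGAC, recognised by neither enzyme
```

Because the insert carries compatible ends on both sides and the vector is opened with a single enzyme, ligation can in general occur in either orientation, so clones obtained this way would normally be screened for the correct one.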

The ORF46.1-287 hybrid was obtained similarly.

The bactericidal efficacy (homologous strain) of antibodies raised against the hybrid proteins
was compared with antibodies raised against simple mixtures of the component antigens:





           Mixture with 287    Hybrid with 287
919        32000               16000
953        8192                8192
ORF46.1    128                 8192



Data for bactericidal activity against heterologous MenB strains and against serotypes A and 
C were also obtained for 919-287 and 953-287: 
                 919                   953                   ORF46.1
Strain           Mixture    Hybrid     Mixture    Hybrid     Mixture    Hybrid
[illegible]      512        1024       512        1024                  1024
NGH38            1024       2048       2048       4096                  4096
BZ232            512        128        1024       16
MenA (F6124)     512        2048       2048       32                    1024
MenC (C11)       >2048      n.d.       >2048      n.d.                  n.d.
MenC (BZ133)     >4096      >8192      >4096      <16                   2048



Hybrids of ORF46.1 and 919 were also constructed. Best results (four-fold higher titre) were 
achieved with 919 at the N-terminus. 

Hybrids 919-519His, ORF97-225His and 225-ORF97His were also tested. These gave
moderate ELISA titres and bactericidal antibody responses.

Example 18 - the leader peptide from ORF4

As shown above, the leader peptide of ORF4 can be fused to the mature sequence of other
proteins (e.g. proteins 287 and 919). It is able to direct lipidation in E.coli.

Example 19 - domains in 564

The protein '564' is very large (2073aa), and it is difficult to clone and express it in complete
form. To facilitate expression, the protein has been divided into four domains, as shown in
figure 8 (according to the MC58 sequence):



Domain 


A 


B 


C 


D 


Amino Acids 


79-360 


361-731 


732-2044 


2045-2073 
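
The domain boundaries above are 1-based and inclusive, so extracting a domain (or a contiguous domain-pair such as ab or cd) from a sequence string needs an offset of one at the start. The following is a minimal sketch under that assumption; `PLACEHOLDER` is a hypothetical stand-in for the 2073-residue 564 sequence, which is not reproduced here.

```python
# Domain boundaries of protein 564 (1-based, inclusive), as listed above.
DOMAINS = {
    "A": (79, 360),
    "B": (361, 731),
    "C": (732, 2044),
    "D": (2045, 2073),
}

# Hypothetical placeholder: a dummy string with the length of protein 564.
PLACEHOLDER = "X" * 2073


def domain_length(name):
    """Number of residues in a domain."""
    start, end = DOMAINS[name]
    return end - start + 1


def extract(sequence, first, last=None):
    """Return one domain, or the contiguous span from `first` to `last`."""
    start = DOMAINS[first][0]
    end = DOMAINS[last or first][1]
    # Convert 1-based inclusive coordinates to a Python slice.
    return sequence[start - 1:end]


lengths = {name: domain_length(name) for name in DOMAINS}
pair_ab = extract(PLACEHOLDER, "A", "B")
print(lengths, len(pair_ab))
```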



These domains show the following homologies:

• Domain A shows homology to other bacterial toxins:

gb|AAG03431.1|AE004443_9 probable hemagglutinin [Pseudomonas aeruginosa] (38%)
gb|AAC31981.1| (L39897) HecA [Pectobacterium chrysanthemi] (45%)
emb|CAA36409.1| (X52156) filamentous hemagglutinin [Bordetella pertussis] (31%)
gb|AAC79757.1| (AF057695) large supernatant protein1 [Haemophilus ducreyi] (26%)
gb|AAA25657.1| (M30186) HpmA precursor [Proteus mirabilis] (29%)

• Domain B shows no homology, and is specific to 564.

• Domain C shows homology to:

gb|AAF84995.1|AE004032 HA-like secreted protein [Xylella fastidiosa] (33%)
gb|AAG05850.1|AE004673 hypothetical protein [Pseudomonas aeruginosa] (27%)
gb|AAF68414.1|AF237928 putative FHA [Pasteurella multocida] (23%)
gb|AAC79757.1| (AF057695) large supernatant protein1 [Haemophilus ducreyi] (23%)
pir||S21010 FHA B precursor [Bordetella pertussis] (20%)

• Domain D shows homology to other bacterial toxins:

gb|AAF84995.1|AE004032_14 HA-like secreted protein [Xylella fastidiosa] (29%)

Using the MC58 strain sequence, good intracellular expression of 564ab was obtained in the
form of GST-fusions (no purification) and his-tagged protein; this domain-pair was also
expressed as a lipoprotein, which showed moderate expression in the outer membrane/
supernatant fraction.

The b domain showed moderate intracellular expression when expressed as a his-tagged
product (no purification), and good expression as a GST-fusion.

The c domain showed good intracellular expression as a GST-fusion, but was insoluble. The
d domain showed moderate intracellular expression as a his-tagged product (no purification).
The cd protein domain-pair showed moderate intracellular expression (no purification) as a
GST-fusion.

Good bactericidal assay titres were observed using the c domain and the bc pair.

Example 20 - the 919 leader peptide

The 20mer leader peptide from 919 is discussed in example 1 above:

MKKYLFRAAL YGIAAAILAA

As shown in example 1, deletion of this leader improves heterologous expression, as does
substitution with the ORF4 leader peptide. The influence of the 919 leader on expression
was investigated by fusing the coding sequence to the PhoC reporter gene from Morganella
morganii [Thaller et al (1994) Microbiology 140:1341-1350]. The construct was cloned in
the pET21-b plasmid between the NdeI and XhoI sites (Figure 9):

  1 MKKYLFRAAL YGIAAAILAA AIPAGNDATT KPDLYYLKNE QAIDSLKLLP
 51 PPPEVGSIQF LNDQAMYEKG RMLRNTERGK QAQADADLAA GGVATAFSGA
101 FGYPITEKDS PELYKLLTNM IEDAGDLATR SAKEHYMRIR PFAFYGTETC
151 NTKDQKKLST NGSYPSGHTS IGWATALVLA EVUPANQDAI LERGYQLGQS
201 RVICGYHWQS DVDAARIVGS AAVATLHSDP AFQAQLAKAK QEFAQKSQK*

The level of expression of PhoC from this plasmid is >200-fold lower than that found for the
same construct but containing the native PhoC signal peptide. The same result was obtained
even after substitution of the T7 promoter with the E.coli Plac promoter. This means that the
influence of the 919 leader sequence on expression does not depend on the promoter used.

In order to investigate whether the results observed were due to some peculiarity of the 919 signal
peptide nucleotide sequence (secondary structure formation, sensitivity to RNases, etc.) or
to protein instability induced by the presence of this signal peptide, a number of mutants
were generated. The approach used was a substitution of nucleotides of the 919 signal
peptide sequence by cloning synthetic linkers containing degenerate codons. In this way,
mutants were obtained with nucleotide and/or amino acid substitutions.



Two different linkers were used, designed to produce mutations in two different regions of
the 919 signal peptide sequence, in the first 19 base pairs (L1) and between bases 20-36 (S1).

L1: 5' T ATG AAa/g TAc/t c/tTN TTt/c a/cGC GCC GCC CTG TAC GGC ATC GCC GCC
GCC ATC CTC GCC GCC GCG ATC CC 3'

S1: 5' T ATG AAA AAA TAC CTA TTC CGa/g GCN GCN c/tTa/g TAc/t GGc/g ATC GCC
GCC GCC ATC CTC GCC GCC GCG ATC CC 3'
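
The number of distinct sequences encoded by each degenerate linker can be estimated with a short sketch. It assumes that a lower-case pair such as a/g marks a two-way choice at one position and that N stands for any of the four bases; the linker strings are copied as printed above, and the function names are hypothetical.

```python
import re
from math import prod

# Linker sequences as printed in the text (spaces are ignored).
L1 = ("T ATG AAa/g TAc/t c/tTN TTt/c a/cGC GCC GCC CTG TAC GGC ATC GCC GCC "
      "GCC ATC CTC GCC GCC GCG ATC CC")
S1 = ("T ATG AAA AAA TAC CTA TTC CGa/g GCN GCN c/tTa/g TAc/t GGc/g ATC GCC "
      "GCC GCC ATC CTC GCC GCC GCG ATC CC")

# One token per nucleotide position: a two-way choice, N, or a fixed base.
TOKEN = re.compile(r"[acgt]/[acgt]|N|[ACGT]")


def position_options(linker):
    """List the bases allowed at each position of a degenerate linker."""
    options = []
    for token in TOKEN.findall(linker.replace(" ", "")):
        if token == "N":
            options.append("ACGT")
        elif "/" in token:
            options.append(token.replace("/", "").upper())
        else:
            options.append(token)
    return options


def variant_count(linker):
    """Number of distinct nucleotide sequences the linker can encode."""
    return prod(len(choice) for choice in position_options(linker))


print(variant_count(L1), variant_count(S1))
```

These counts describe the designed nucleotide diversity only; they say nothing about which variants were actually recovered.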

The alignment of some of the mutants obtained is given below.

L1 mutants:

9L1-a  ATGAAGAAGTACCTTTTCAGCGCCGCC---------------------------------
9L1-e  ATGAAAAAATACTTTTTCCGCGCCGCC---------------------------------
9L1-d  ATGAAAAAATACTTTTTCCGCGCCGCC---------------------------------
9L1-f  ATGAAAAAATATCTCTTTAGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC
919sp  ATGAAAAAATACCTATTCCGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC

9L1a   MKKYLFSAA-----------
9L1e   MKKYFFRAA-----------
9L1d   MKKYFFRAA-----------
9L1f   MKKYLFSAALYGIAAAILAA
919sp  MKKYLFRAALYGIAAAILAA (i.e. native signal peptide)

S1 mutants:

9S1-e  ATGAAAAAATACCTATTC------------------ATCGCCGCCGCCATCCTCGCCGCC
9S1-c  ATGAAAAAATACCTATTCCGAGCTGCCCAATACGGCATCGCCGCCGCCATCCTCGCCGCC
9S1-b  ATGAAAAAATACCTATTCCGGGCCGCCCAATACGGCATCGCCGCCGCCATCCTCGCCGCC
9S1-i  ATGAAAAAATACCTATTCCGGGCGGCTTTGTACGGGATCGCCGCCGCCATCCTCGCCGCC
919sp  ATGAAAAAATACCTATTCCGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC

9S1e   MKKYLF------IAAAILAA
9S1c   MKKYLFRAAQYGIAAAILAA
9S1b   MKKYLFRAAQYGIAAAILAA
9S1i   MKKYLFRAALYGIAAAILAA
919sp  MKKYLFRAALYGIAAAILAA
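
The correspondence between the nucleotide and amino acid alignments can be checked by translating each signal-peptide coding sequence. The following is a minimal sketch using the standard genetic code; the sequences are copied from the alignment above with gaps removed, and the function name `translate` is hypothetical.

```python
# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna):
    """Translate an in-frame DNA sequence, ignoring alignment gaps."""
    dna = dna.replace("-", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


SIGNAL_PEPTIDES = {
    "919sp": "ATGAAAAAATACCTATTCCGCGCCGCCCTGTACGGCATCGCCGCCGCCATCCTCGCCGCC",
    "9L1-a": "ATGAAGAAGTACCTTTTCAGCGCCGCC",
    "9L1-e": "ATGAAAAAATACTTTTTCCGCGCCGCC",
    "9S1-e": "ATGAAAAAATACCTATTCATCGCCGCCGCCATCCTCGCCGCC",
    "9S1-c": "ATGAAAAAATACCTATTCCGAGCTGCCCAATACGGCATCGCCGCCGCCATCCTCGCCGCC",
}

for name, dna in SIGNAL_PEPTIDES.items():
    print(name, translate(dna))
```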



As shown in the sequence alignments, most of the mutants analysed contain in-frame
deletions which were unexpectedly produced by the host cells.

Selection of the mutants was performed by transforming E. coli BL21(DE3) cells with DNA
prepared from a mixture of L1 and S1 mutated clones. Single transformants were screened
for high PhoC activity by streaking them onto LB plates containing 100 µg/ml ampicillin,
50 µg/ml methyl green, 1 mg/ml PDP (phenolphthalein diphosphate). On this medium PhoC-
producing cells become green (Figure 10).

A quantitative analysis of PhoC produced by these mutants was carried out in liquid medium
using pNPP as a substrate for PhoC activity. The specific activities measured in cell extracts
and supernatants of mutants grown in liquid medium for 0, 30, 90, 180 min. were:

CELL EXTRACTS

            0         30        90        180
control     0,00      0,00      0,00      0,00
9phoC       1,11      1,11      3,33      4,44
9S1e        102,12    111,00    149,85    172,05
9L1a        206,46    111,00    94,35     83,25
9L1d        5,11      4,77      4,00      3,11
9L1f        27,75     94,35     82,14     36,63
9S1b        156,51    111,00    72,15     28,86
9S1c        72,15     33,30     21,09     14,43
9S1i        156,51    83,25     55,50     26,64
phoCwt      194,25    180,93    149,85    142,08

SUPERNATANTS

            0         30        90        180
control     0,00      0,00      0,00      0,00
9phoC       0,33      0,00      0,00      0,00
9S1e        0,11      0,22      0,44      0,89
9L1a        4,88      5,99      5,99      7,22
9L1d        0,11      0,11      0,11      0,11
9L1f        0,11      0,22      0,11      0,11
9S1b        1,44      1,44      1,44      1,67
9S1c        0,44      0,78      0,56      0,67
9S1i        0,22      0,44      0,22      0,78
phoCwt      34,41     43,29     87,69     177,60


Some of the mutants produce high amounts of PhoC and in particular, mutant 9L1a can
secrete PhoC in the culture medium. This is noteworthy since the signal peptide sequence of
this mutant is only 9 amino acids long. This is the shortest signal peptide described to date.
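
One way to compare the mutants is to express each cell-extract activity relative to the construct carrying the unmodified 919 leader (9phoC) at the same time point. The sketch below assumes the values use a decimal comma and share one unit; the function names are hypothetical and the control row is omitted because it is zero throughout.

```python
# Cell-extract specific activities from the table above (decimal commas).
CELL_EXTRACTS = """
9phoC   1,11    1,11    3,33    4,44
9S1e    102,12  111,00  149,85  172,05
9L1a    206,46  111,00  94,35   83,25
9L1d    5,11    4,77    4,00    3,11
9L1f    27,75   94,35   82,14   36,63
9S1b    156,51  111,00  72,15   28,86
9S1c    72,15   33,30   21,09   14,43
9S1i    156,51  83,25   55,50   26,64
phoCwt  194,25  180,93  149,85  142,08
"""
TIME_POINTS = (0, 30, 90, 180)  # minutes


def parse_table(text):
    """Parse whitespace-separated rows, converting decimal commas to floats."""
    rows = {}
    for line in text.strip().splitlines():
        name, *values = line.split()
        rows[name] = [float(value.replace(",", ".")) for value in values]
    return rows


def fold_over_reference(rows, reference="9phoC"):
    """Divide every row by the reference row, time point by time point."""
    ref = rows[reference]
    return {
        name: [value / base for value, base in zip(values, ref)]
        for name, values in rows.items()
        if name != reference
    }


rows = parse_table(CELL_EXTRACTS)
folds = fold_over_reference(rows)
for name, values in folds.items():
    print(name, dict(zip(TIME_POINTS, (round(v, 1) for v in values))))
```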



Example 21 - C-terminal deletions of Maf-related proteins 
MafB-related proteins include 730, ORF46 and ORF29. 

The 730 protein from MC58 has the following sequence:

  1 VKPLRRLTNL LAACAVAAAA LIQPALAADL AQDPFITDNA QRQHYEPGGK
 51 YHLFGDPRGS VSDRTGKINV IQDYTHQMGN LLIQQANING TIGYHTRFSG
101 HGHEEHAPFD NHAADSASEE KGNVDEGFTV YRLNWEGHEH HPADAYDGPK
151 GGNYPKPTGA RDEYTYHVNG TARSIKLNPT DTRSIRQRIS DNYSNLGSNF
201 SDRADEANRK MFEHNAKLDR WGNSMEFING VAAGALNPFI SAGEALGIGD
251 ILYGTRYAID KAAMRNIAPL PAEGKFAVIG GLGSVAGFEK NTREAVDRWI
301 QENPNAAETV EAVFNVAAAA KVAKLAKAAK PGKAAVSGDF ADSYKKKLAL
351 SDSARQLYQN AKYREALDIH YEDLIRRKTD GSSKFINGRE IDAVTNDALI
401 QAKRTISAID KPKNFLUQKN RKQIKATIEA ANQQGKRAEF WFKYGVHSQV
451 KSYIESKGGI VKTGLGD*

The leader peptide (amino acids 1-27) is underlined.

730 shows similar features to ORF46 (see example 8 above):

- as for ORF46, the conservation of the 730 sequence among MenB, MenA and gonococcus
is high (>80%) only for the N-terminal portion. The C-terminus, from ~340, is highly
divergent.

- its predicted secondary structure contains a hydrophobic segment spanning the central
region of the molecule (aa. 227-247).

- expression of the full-length gene in E. coli gives very low yields of protein. Expression
from tagged or untagged constructs where the signal peptide sequence has been omitted
has a toxic effect on the host cells. In other words, the presence of the full-length mature
protein in the cytoplasm is highly toxic for the host cell while its translocation to the
periplasm (mediated by the signal peptide) has no detectable effect on cell viability. This
"intracellular toxicity" of 730 is particularly high since clones for expression of the
leaderless 730 can only be obtained at very low frequency using a recA genetic
background (E. coli strains: HB101 for cloning; HMS174(DE3) for expression).

To overcome this toxicity, a similar approach was used for 730 as described in example 8 for
ORF46. Four C-terminal truncated forms were obtained, each of which is well expressed. All
were obtained from intracellular expression of His-tagged leaderless 730.

Form A consists of the N-terminal hydrophilic region of the mature protein (aa. 28-226).
This was purified as a soluble His-tagged product, having a higher-than-expected MW.

Form B extends to the end of the region conserved between serogroups (aa. 28-340). This
was purified as an insoluble His-tagged product.

The C-terminal truncated forms named C1 and C2 were obtained after screening for clones
expressing high levels of 730-His in strain HMS174(DE3). Briefly, the pET21b
plasmid containing the His-tagged sequence coding for the full-length mature 730 protein
was used to transform the recA strain HMS174(DE3). Transformants were obtained at low
frequency which showed two phenotypes: large colonies and very small colonies. Several
large and small colonies were analysed for expression of the 730-His clone. Only cells from
large colonies over-expressed a protein recognised by anti-730A antibodies. However, the
protein over-expressed in different clones showed differences in molecular mass.
Sequencing of two of the clones revealed that in both cases integration of an E. coli IS
sequence had occurred within the sequence coding for the C-terminal region of 730. The two
integration events have produced in-frame fusions with 1 additional codon in the case of C1,
and 12 additional codons in the case of C2 (Figure 11). The resulting "mutant" forms of 730
have the following sequences:

730-C1 (due to an IS1 insertion - Figure 11A)

  1 MADLAQDPFI TDNAQRQHYE PGGKYHLFGD PRGSVSDRTG KINVIQDYTH
 51 QMGNLLIQQA NINGTIGYHT RFSGHGHEEH APFDNHAADS ASEEKGNVDE
101 GFTVYRLNWE GHEHHPADAY DGPKGGNYPK PTGARDEYTY HVNGTARSIK
151 LNPTDTRSIR QRISDNYSNL GSNFSDRADE ANRKMFEHNA KLDRWGNSME
201 FINGVAAGAL NPFISAGEAL GIGDILYGTR YAIDKAAMRN IAPLPAEGKF
251 AVIGGLGSVA GFEKNTREAV DRWIQENPNA AETVEAVFNV AAAAKVAKLA
301 KAAKPGKAAV SGDFADSYKK KLALSDSARQ LYQNAKYREA LDIHYEDLIR
351 RKTDGSSKFI NGREIDAVTN DALIQAR*

The additional amino acid produced by the insertion is underlined. 

730-C2 (due to an IS5 insertion - Figure 11B)

  1 MADLAQDPFI TDNAQRQHYE PGGKYHLFGD PRGSVSDRTG KINVIQDYTH
 51 QMGNLLIQQA NINGTIGYHT RFSGHGHEEH APFDNHAADS ASEEKGNVDE
101 GFTVYRLNWE GHEHHPADAY DGPKGGNYPK PTGARDEYTY HVNGTARSIK
151 LNPTDTRSIR QRISDNYSNL GSNFSDRADE ANRKMFEHNA KLDRWGNSME
201 FINGVAAGAL NPFISAGEAL GIGDILYGTR YAIDKAAMRN IAPLPAEGKF
251 AVIGGLGSVA GFEKNTREAV DRWIQENPNA AETVEAVFNV AAAAKVAKLA
301 KAAKPGKAAV SGDFADSYKK KLALSDSARQ LYQNAKYREA LGKVRISGEI
351 LLG*

The additional amino acids produced by the insertion are underlined. 

In conclusion, intracellular expression of the 730-C1 form gives a very high level of protein
and has no toxic effect on the host cells, whereas the presence of the native C-terminus is
toxic. These data suggest that the "intracellular toxicity" of 730 is associated with the
C-terminal 65 amino acids of the protein.
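
The figure of 65 residues can be reproduced from the sequence listings with simple arithmetic. The sketch below uses lengths counted from the listings above (467 residues for full-length 730, 377 for 730-C1, 353 for 730-C2) and assumes that each truncated form starts with one added methionine in place of the 27-residue leader; the function names are hypothetical.

```python
FULL_LENGTH = 467    # residues in the full-length 730 listing
LEADER_LENGTH = 27   # the mature protein starts at residue 28


def last_native_residue(construct_length, added_n_terminal, added_c_terminal):
    """Position in full-length 730 of the last native residue retained."""
    native = construct_length - added_n_terminal - added_c_terminal
    return LEADER_LENGTH + native


def removed_c_terminal(construct_length, added_n_terminal, added_c_terminal):
    """Number of native C-terminal residues missing from the construct."""
    return FULL_LENGTH - last_native_residue(
        construct_length, added_n_terminal, added_c_terminal
    )


# 730-C1: 377 residues, 1 added Met, 1 residue from the IS1 insertion.
c1_removed = removed_c_terminal(377, 1, 1)
# 730-C2: 353 residues, 1 added Met, 12 residues from the IS5 insertion.
c2_removed = removed_c_terminal(353, 1, 12)

print(last_native_residue(377, 1, 1), c1_removed)
print(last_native_residue(353, 1, 12), c2_removed)
```

On these counts the C1 form retains native residues up to position 402, and the residues it lacks match the 65 C-terminal amino acids referred to above.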

Equivalent truncation of ORF29 to the first 231 or 368 amino acids has been performed,
using expression with or without the leader peptide (amino acids 1-26; deletion gives
cytoplasmic expression) and with or without a His-tag.



Example 22 - domains in 961 

As described in example 9 above, the GST-fusion of 961 was the best-expressed in E.coli.
To improve expression, the protein was divided into domains (figure 12).

The domains of 961 were designed on the basis of YadA (an adhesin produced by Yersinia
which is localized on the bacterial surface and forms oligomers that generate surface
projections [Hoiczyk et al (2000) EMBO J 19:5989-99]) and
are: leader peptide, head domain, coiled-coil region (stalk), and membrane anchor domain.

These domains were expressed with or without the leader peptide, and optionally fused
either to a C-terminal His-tag or to N-terminal GST. E.coli clones expressing different
domains of 961 were analyzed by SDS-PAGE and western blot for the production and
localization of the expressed protein, from over-night (o/n) culture or after 3 hours induction
with IPTG. The results were:





                                   Total lysate      Periplasm         Supernatant       OMV
                                   (Western Blot)    (Western Blot)    (Western Blot)    SDS-PAGE
961 (o/n) / 961 (IPTG)             +/-
961-L (o/n) / 961-L (IPTG)         + / +                                                 + / +
961c-L (o/n) / 961c-L (IPTG)       +                 +                 +
961Δ1-L (o/n) / 961Δ1-L (IPTG)     +                                                     +


The results show that in E.coli:

■ 961-L is highly expressed and localized on the outer membrane. By western blot analysis
two specific bands have been detected: one at ~45kDa (the predicted molecular weight) and
one at ~180kDa, indicating that 961-L can form oligomers. Additionally, these aggregates
are more expressed in the over-night culture (without IPTG induction). OMV preparations of
this clone were used to immunize mice and serum was obtained. Using the overnight culture
(predominantly oligomeric form) the serum was bactericidal; with the IPTG-induced culture
(predominantly monomeric) it was not bactericidal.

■ 961Δ1-L (with a partial deletion in the anchor region) is highly expressed and localized
on the outer membrane, but does not form oligomers;

■ 961c-L (without the anchor region) is produced in soluble form and exported into the
supernatant.

Titres in ELISA and in the serum bactericidal assay using His-fusions were as follows:

                      ELISA     Bactericidal
961a (aa 24-268)      24397     4096
961b (aa 269-405)     7763      64
961c-L                29770     8192
961c (2996)           30774     >65536
961c (MC58)           33437     16384
961d                  26069     >65536


E.coli clones expressing different forms of 961 (961, 961-L, 961Δ1-L and 961c-L) were used
to investigate whether 961 is an adhesin (c.f. YadA). An adhesion assay was performed using
(a) human epithelial cells and (b) E.coli clones after either over-night culture or three
hours IPTG induction. 961-L grown over-night (961Δ1-L) and IPTG-induced 961c-L (the
clones expressing protein on surface) adhere to human epithelial cells.

961c was also used in hybrid proteins (see above). As 961 and its domain variants direct
efficient expression, they are ideally suited as the N-terminal portion of a hybrid protein.

Example 23 - further hybrids 

Further hybrid proteins of the invention are shown below (see also Figure 14). These are 
advantageous when compared to the individual proteins: 



ORF46.1-741

   1 ATGTCAGATT TGGCAAACGA TTCTTTTATC CGGCAGGTTC TCGACCGTCA
  51 GCATTTCGAA CCCGACGGGA AATACCACCT ATTCGGCAGC AGGGGGGAAC
 101 TTGCCGAGCG CAGCGGCCAT ATCGGATTGG GAAAAATACA AAGCCATCAG
 151 TTGGGCAACC TGATGATTCA ACAGGCGGCC ATTAAAGGAA ATATCGGCTA
 201 CATTGTCCGC TTTTCCGATC ACGGGCACGA AGTCCATTCC CCCTTCGACA
 251 ACCATGCCTC ACATTCCGAT TCTGATGAAG CCGGTAGTCC CGTTGACGGA
 301 TTTAGCCTTT ACCGCATCCA TTGGGACGGA TACGAACACC ATCCCGCCGA
 351 CGGCTATGAC GGGCCACAGG GCGGCGGCTA TCCCGCTCCC AAAGGCGCGA
 401 GGGATATATA CAGCTACGAC ATAAAAGGCG TTGCCCAAAA TATCCGCCTC
 451 AACCTGACCG ACAACCGCAG CACCGGACAA CGGCTTGCCG ACCGTTTCCA
 501 CAATGCCGGT AGTATGCTGA CGCAAGGAGT AGGCGACGGA TTCAAACGCG
 551 CCACCCGATA CAGCCCCGAG CTGGACAGAT CGGGCAATGC CGCCGAAGCC
 601 TTCAACGGCA CTGCAGATAT CGTTAAAAAC ATCATCGGCG CGGCAGGAGA
 651 AATTGTCGGC GCAGGCGATG CCGTGCAGGG CATAAGCGAA GGCTCAAACA
 701 TTGCTGTCAT GCACGGCTTG GGTCTGCTTT CCACCGAAAA CAAGATGGCG
 751 CGCATCAACG ATTTGGCAGA TATGGCGCAA CTCAAAGACT ATGCCGCAGC
 801 AGCCATCCGC GATTGGGCAG TCCAAAACCC CAATGCCGCA CAAGGCATAG
 851 AAGCCGTCAG CAATATCTTT ATGGCAGCCA TCCCCATCAA AGGGATTGGA
 901 GCTGTTCGGG GAAAATACGG CTTGGGCGGC ATCACGGCAC ATCCTATCAA
 951 GCGGTCGCAG ATGGGCGCGA TCGCATTGCC GAAAGGGAAA TCCGCCGTCA
1001 GCGACAATTT TGCCGATGCG GCATACGCCA AATACCCGTC CCCTTACCAT
1051 TCCCGAAATA TCCGTTCAAA CTTGGAGCAG CGTTACGGCA AAGAAAACAT
1101 CACCTCCTCA ACCGTGCCGC CGTCAAACGG CAAAAATGTC AAACTGGCAG
1151 ACCAACGCCA CCCGAAGACA GGCGTACCGT TTGACGGTAA AGGGTTTCCG
1201 AATTTTGAGA AGCACGTGAA ATATGATACG GGATCCGGAG GGGGTGGTGT
1251 CGCCGCCGAC ATCGGTGCGG GGCTTGCCGA TGCACTAACC GCACCGCTCG
1301 ACCATAAAGA CAAAGGTTTG CAGTCTTTGA CGCTGGATCA GTCCGTCAGG
1351 AAAAACGAGA AACTGAAGCT GGCGGCACAA GGTGCGGAAA AAACTTATGG
1401 AAACGGTGAC AGCCTCAATA CGGGCAAATT GAAGAACGAC AAGGTCAGCC
1451 GTTTCGACTT TATCCGCCAA ATCGAAGTGG ACGGGCAGCT CATTACCTTG
1501 GAGAGTGGAG AGTTCCAAGT ATACAAACAA AGCCATTCCG CCTTAACCGC
1551 CTTTCAGACC GAGCAAATAC AAGATTCGGA GCATTCCGGG AAGATGGTTG
1601 CGAAACGCCA GTTCAGAATC GGCGACATAG CGGGCGAACA TACATCTTTT
1651 GACAAGCTTC CCGAAGGCGG CAGGGCGACA TATCGCGGGA CGGCGTTCGG
1701 TTCAGACGAT GCCGGCGGAA AACTGACCTA CACCATAGAT TTCGCCGCCA
1751 AGCAGGGAAA CGGCAAAATC GAACATTTGA AATCGCCAGA ACTCAATGTC
1801 GACCTGGCCG CCGCCGATAT CAAGCCGGAT GGAAAACGCC ATGCCGTCAT
1851 CAGCGGTTCC GTCCTTTACA ACCAAGCCGA GAAAGGCAGT TACTCCCTCG
1901 GTATCTTTGG CGGAAAAGCC CAGGAAGTTG CCGGCAGCGC GGAAGTGAAA
1951 ACCGTAAACG GCATACGCCA TATCGGCCTT GCCGCCAAGC AACTCGAGCA
2001 CCACCACCAC CACCACTGA

   1 MSDLANDSFI RQVLDRQHFE PDGKYHLFGS RGELAERSGH IGLGKIQSHQ
  51 LGNLMIQQAA IKGNIGYIVR FSDHGHEVHS PFDNHASHSD SDEAGSPVDG
 101 FSLYRIHWDG YEHHPADGYD GPQGGGYPAP KGARDIYSYD IKGVAQNIRL
 151 NLTDNRSTGQ RLADRFHNAG SMLTQGVGDG FKRATRYSPE LDRSGNAAEA
 201 FNGTADIVKN IIGAAGEIVG AGDAVQGISE GSNIAVMHGL GLLSTENKMA
 251 RINDLADMAQ LKDYAAAAIR DWAVQNPNAA QGIEAVSNIF MAAIPIKGIG
 301 AVRGKYGLGG ITAHPIKRSQ MGAIALPKGK SAVSDNFADA AYAKYPSPYH
 351 SRNIRSNLEQ RYGKENITSS TVPPSNGKNV KLADQRHPKT GVPFDGKGFP
 401 NFEKHVKYDT GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR
 451 KNEKLKLAAQ GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ IEVDGQLITL
 501 ESGEFQVYKQ SHSALTAFQT EQIQDSEHSG KMVAKRQFRI GDIAGEHTSF
 551 DKLPEGGRAT YRGTAFGSDD AGGKLTYTID FAAKQGNGKI EHLKSPELNV
 601 DLAAADIKPD GKRHAVISGS VLYNQAEKGS YSLGIFGGKA QEVAGSAEVK
 651 TVNGIRHIGL AAKQLEHHHH HH*

ORF46.1-961

   1 ATGTCAGATT TGGCAAACGA TTCTTTTATC CGGCAGGTTC TCGACCGTCA
  51 GCATTTCGAA CCCGACGGGA AATACCACCT ATTCGGCAGC AGGGGGGAAC
 101 TTGCCGAGCG CAGCGGCCAT ATCGGATTGG GAAAAATACA AAGCCATCAG
 151 TTGGGCAACC TGATGATTCA ACAGGCGGCC ATTAAAGGAA ATATCGGCTA
 201 CATTGTCCGC TTTTCCGATC ACGGGCACGA AGTCCATTCC CCCTTCGACA
 251 ACCATGCCTC ACATTCCGAT TCTGATGAAG CCGGTAGTCC CGTTGACGGA
 301 TTTAGCCTTT ACCGCATCCA TTGGGACGGA TACGAACACC ATCCCGCCGA
 351 CGGCTATGAC GGGCCACAGG GCGGCGGCTA TCCCGCTCCC AAAGGCGCGA
 401 GGGATATATA CAGCTACGAC ATAAAAGGCG TTGCCCAAAA TATCCGCCTC
 451 AACCTGACCG ACAACCGCAG CACCGGACAA CGGCTTGCCG ACCGTTTCCA
 501 CAATGCCGGT AGTATGCTGA CGCAAGGAGT AGGCGACGGA TTCAAACGCG
 551 CCACCCGATA CAGCCCCGAG CTGGACAGAT CGGGCAATGC CGCCGAAGCC
 601 TTCAACGGCA CTGCAGATAT CGTTAAAAAC ATCATCGGCG CGGCAGGAGA
 651 AATTGTCGGC GCAGGCGATG CCGTGCAGGG CATAAGCGAA GGCTCAAACA
 701 TTGCTGTCAT GCACGGCTTG GGTCTGCTTT CCACCGAAAA CAAGATGGCG
 751 CGCATCAACG ATTTGGCAGA TATGGCGCAA CTCAAAGACT ATGCCGCAGC
 801 AGCCATCCGC GATTGGGCAG TCCAAAACCC CAATGCCGCA CAAGGCATAG
 851 AAGCCGTCAG CAATATCTTT ATGGCAGCCA TCCCCATCAA AGGGATTGGA
 901 GCTGTTCGGG GAAAATACGG CTTGGGCGGC ATCACGGCAC ATCCTATCAA
 951 GCGGTCGCAG ATGGGCGCGA TCGCATTGCC GAAAGGGAAA TCCGCCGTCA
1001 GCGACAATTT TGCCGATGCG GCATACGCCA AATACCCGTC CCCTTACCAT
1051 TCCCGAAATA TCCGTTCAAA CTTGGAGCAG CGTTACGGCA AAGAAAACAT
1101 CACCTCCTCA ACCGTGCCGC CGTCAAACGG CAAAAATGTC AAACTGGCAG
1151 ACCAACGCCA CCCGAAGACA GGCGTACCGT TTGACGGTAA AGGGTTTCCG
1201 AATTTTGAGA AGCACGTGAA ATATGATACG GGATCCGGAG GAGGAGGAGC
1251 CACAAACGAC GACGATGTTA AAAAAGCTGC CACTGTGGCC ATTGCTGCTG
1301 CCTACAACAA TGGCCAAGAA ATCAACGGTT TCAAAGCTGG AGAGACCATC
1351 TACGACATTG ATGAAGACGG CACAATTACC AAAAAAGACG CAACTGCAGC
1401 CGATGTTGAA GCCGACGACT TTAAAGGTCT GGGTCTGAAA AAAGTCGTGA
1451 CTAACCTGAC CAAAACCGTC AATGAAAACA AACAAAACGT CGATGCCAAA
1501 GTAAAAGCTG CAGAATCTGA AATAGAAAAG TTAACAACCA AGTTAGCAGA
1551 CACTGATGCC GCTTTAGCAG ATACTGATGC CGCTCTGGAT GCAACCACCA
1601 ACGCCTTGAA TAAATTGGGA GAAAATATAA CGACATTTGC TGAAGAGACT
1651 AAGACAAATA TCGTAAAAAT TGATGAAAAA TTAGAAGCCG TGGCTGATAC
1701 CGTCGACAAG CATGCCGAAG CATTCAACGA TATCGCCGAT TCATTGGATG
1751 AAACCAACAC TAAGGCAGAC GAAGCCGTCA AAACCGCCAA TGAAGCCAAA
1801 CAGACGGCCG AAGAAACCAA ACAAAACGTC GATGCCAAAG TAAAAGCTGC
1851 AGAAACTGCA GCAGGCAAAG CCGAAGCTGC CGCTGGCACA GCTAATACTG
1901 CAGCCGACAA GGCCGAAGCT GTCGCTGCAA AAGTTACCGA CATCAAAGCT
1951 GATATCGCTA CGAACAAAGA TAATATTGCT AAAAAAGCAA ACAGTGCCGA
2001 CGTGTACACC AGAGAAGAGT CTGACAGCAA ATTTGTCAGA ATTGATGGTC
2051 TGAACGCTAC TACCGAAAAA TTGGACACAC GCTTGGCTTC TGCTGAAAAA
2101 TCCATTGCCG ATCACGATAC TCGCCTGAAC GGTTTGGATA AAACAGTGTC
2151 AGACCTGCGC AAAGAAACCC GCCAAGGCCT TGCAGAACAA GCCGCGCTCT
2201 CCGGTCTGTT CCAACCTTAC AACGTGGGTC GGTTCAATGT AACGGCTGCA
2251 GTCGGCGGCT ACAAATCCGA ATCGGCAGTC GCCATCGGTA CCGGCTTCCG
2301 CTTTACCGAA AACTTTGCCG CCAAAGCAGG CGTGGCAGTC GGCACTTCGT
2351 CCGGTTCTTC CGCAGCCTAC CATGTCGGCG TCAATTACGA GTGGCTCGAG
2401 CACCACCACC ACCACCACTG A

   1 MSDLANDSFI RQVLDRQHFE PDGKYHLFGS RGELAERSGH IGLGKIQSHQ
  51 LGNLMIQQAA IKGNIGYIVR FSDHGHEVHS PFDNHASHSD SDEAGSPVDG
 101 FSLYRIHWDG YEHHPADGYD GPQGGGYPAP KGARDIYSYD IKGVAQNIRL
 151 NLTDNRSTGQ RLADRFHNAG SMLTQGVGDG FKRATRYSPE LDRSGNAAEA
 201 FNGTADIVKN IIGAAGEIVG AGDAVQGISE GSNIAVMHGL GLLSTENKMA
 251 RINDLADMAQ LKDYAAAAIR DWAVQNPNAA QGIEAVSNIF MAAIPIKGIG
 301 AVRGKYGLGG ITAHPIKRSQ MGAIALPKGK SAVSDNFADA AYAKYPSPYH
 351 SRNIRSNLEQ RYGKENITSS TVPPSNGKNV KLADQRHPKT GVPFDGKGFP
 401 NFEKHVKYDT GSGGGGATND DDVKKAATVA IAAAYNNGQE INGFKAGETI
 451 YDIDEDGTIT KKDATAADVE ADDFKGLGLK KVVTNLTKTV NENKQNVDAK
 501 VKAAESEIEK LTTKLADTDA ALADTDAALD ATTNALNKLG ENITTFAEET
 551 KTNIVKIDEK LEAVADTVDK HAEAFNDIAD SLDETNTKAD EAVKTANEAK
 601 QTAEETKQNV DAKVKAAETA AGKAEAAAGT ANTAADKAEA VAAKVTDIKA
 651 DIATNKDNIA KKANSADVYT REESDSKFVR IDGLNATTEK LDTRLASAEK
 701 SIADHDTRLN GLDKTVSDLR KETRQGLAEQ AALSGLFQPY NVGRFNVTAA
 751 VGGYKSESAV AIGTGFRFTE NFAAKAGVAV GTSSGSSAAY HVGVNYEWLE
 801 HHHHHH*


ORF46.1-961c

   1 ATGTCAGATT TGGCAAACGA TTCTTTTATC CGGCAGGTTC TCGACCGTCA
  51 GCATTTCGAA CCCGACGGGA AATACCACCT ATTCGGCAGC AGGGGGGAAC
 101 TTGCCGAGCG CAGCGGCCAT ATCGGATTGG GAAAAATACA AAGCCATCAG
 151 TTGGGCAACC TGATGATTCA ACAGGCGGCC ATTAAAGGAA ATATCGGCTA
 201 CATTGTCCGC TTTTCCGATC ACGGGCACGA AGTCCATTCC CCCTTCGACA
 251 ACCATGCCTC ACATTCCGAT TCTGATGAAG CCGGTAGTCC CGTTGACGGA
 301 TTTAGCCTTT ACCGCATCCA TTGGGACGGA TACGAACACC ATCCCGCCGA
 351 CGGCTATGAC GGGCCACAGG GCGGCGGCTA TCCCGCTCCC AAAGGCGCGA
 401 GGGATATATA CAGCTACGAC ATAAAAGGCG TTGCCCAAAA TATCCGCCTC
 451 AACCTGACCG ACAACCGCAG CACCGGACAA CGGCTTGCCG ACCGTTTCCA
 501 CAATGCCGGT AGTATGCTGA CGCAAGGAGT AGGCGACGGA TTCAAACGCG
 551 CCACCCGATA CAGCCCCGAG CTGGACAGAT CGGGCAATGC CGCCGAAGCC
 601 TTCAACGGCA CTGCAGATAT CGTTAAAAAC ATCATCGGCG CGGCAGGAGA
 651 AATTGTCGGC GCAGGCGATG CCGTGCAGGG CATAAGCGAA GGCTCAAACA
 701 TTGCTGTCAT GCACGGCTTG GGTCTGCTTT CCACCGAAAA CAAGATGGCG
 751 CGCATCAACG ATTTGGCAGA TATGGCGCAA CTCAAAGACT ATGCCGCAGC
 801 AGCCATCCGC GATTGGGCAG TCCAAAACCC CAATGCCGCA CAAGGCATAG
 851 AAGCCGTCAG CAATATCTTT ATGGCAGCCA TCCCCATCAA AGGGATTGGA
 901 GCTGTTCGGG GAAAATACGG CTTGGGCGGC ATCACGGCAC ATCCTATCAA
 951 GCGGTCGCAG ATGGGCGCGA TCGCATTGCC GAAAGGGAAA TCCGCCGTCA
1001 GCGACAATTT TGCCGATGCG GCATACGCCA AATACCCGTC CCCTTACCAT
1051 TCCCGAAATA TCCGTTCAAA CTTGGAGCAG CGTTACGGCA AAGAAAACAT
1101 CACCTCCTCA ACCGTGCCGC CGTCAAACGG CAAAAATGTC AAACTGGCAG
1151 ACCAACGCCA CCCGAAGACA GGCGTACCGT TTGACGGTAA AGGGTTTCCG
1201 AATTTTGAGA AGCACGTGAA ATATGATACG GGATCCGGAG GAGGAGGAGC
1251 CACAAACGAC GACGATGTTA AAAAAGCTGC CACTGTGGCC ATTGCTGCTG
1301 CCTACAACAA TGGCCAAGAA ATCAACGGTT TCAAAGCTGG AGAGACCATC
1351 TACGACATTG ATGAAGACGG CACAATTACC AAAAAAGACG CAACTGCAGC
1401 CGATGTTGAA GCCGACGACT TTAAAGGTCT GGGTCTGAAA AAAGTCGTGA
1451 CTAACCTGAC CAAAACCGTC AATGAAAACA AACAAAACGT CGATGCCAAA
1501 GTAAAAGCTG CAGAATCTGA AATAGAAAAG TTAACAACCA AGTTAGCAGA
1551 CACTGATGCC GCTTTAGCAG ATACTGATGC CGCTCTGGAT GCAACCACCA
1601 ACGCCTTGAA TAAATTGGGA GAAAATATAA CGACATTTGC TGAAGAGACT
1651 AAGACAAATA TCGTAAAAAT TGATGAAAAA TTAGAAGCCG TGGCTGATAC
1701 CGTCGACAAG CATGCCGAAG CATTCAACGA TATCGCCGAT TCATTGGATG
1751 AAACCAACAC TAAGGCAGAC GAAGCCGTCA AAACCGCCAA TGAAGCCAAA
1801 CAGACGGCCG AAGAAACCAA ACAAAACGTC GATGCCAAAG TAAAAGCTGC
1851 AGAAACTGCA GCAGGCAAAG CCGAAGCTGC CGCTGGCACA GCTAATACTG
1901 CAGCCGACAA GGCCGAAGCT GTCGCTGCAA AAGTTACCGA CATCAAAGCT
1951 GATATCGCTA CGAACAAAGA TAATATTGCT AAAAAAGCAA ACAGTGCCGA
2001 CGTGTACACC AGAGAAGAGT CTGACAGCAA ATTTGTCAGA ATTGATGGTC
2051 TGAACGCTAC TACCGAAAAA TTGGACACAC GCTTGGCTTC TGCTGAAAAA
2101 TCCATTGCCG ATCACGATAC TCGCCTGAAC GGTTTGGATA AAACAGTGTC
2151 AGACCTGCGC AAAGAAACCC GCCAAGGCCT TGCAGAACAA GCCGCGCTCT
2201 CCGGTCTGTT CCAACCTTAC AACGTGGGTC TCGAGCACCA CCACCACCAC
2251 CACTGA

   1 MSDLANDSFI RQVLDRQHFE PDGKYHLFGS RGELAERSGH IGLGKIQSHQ
  51 LGNLMIQQAA IKGNIGYIVR FSDHGHEVHS PFDNHASHSD SDEAGSPVDG
 101 FSLYRIHWDG YEHHPADGYD GPQGGGYPAP KGARDIYSYD IKGVAQNIRL
 151 NLTDNRSTGQ RLADRFHNAG SMLTQGVGDG FKRATRYSPE LDRSGNAAEA
 201 FNGTADIVKN IIGAAGEIVG AGDAVQGISE GSNIAVMHGL GLLSTENKMA
 251 RINDLADMAQ LKDYAAAAIR DWAVQNPNAA QGIEAVSNIF MAAIPIKGIG
 301 AVRGKYGLGG ITAHPIKRSQ MGAIALPKGK SAVSDNFADA AYAKYPSPYH
 351 SRNIRSNLEQ RYGKENITSS TVPPSNGKNV KLADQRHPKT GVPFDGKGFP
 401 NFEKHVKYDT GSGGGGATND DDVKKAATVA IAAAYNNGQE INGFKAGETI
 451 YDIDEDGTIT KKDATAADVE ADDFKGLGLK KVVTNLTKTV NENKQNVDAK
 501 VKAAESEIEK LTTKLADTDA ALADTDAALD ATTNALNKLG ENITTFAEET
 551 KTNIVKIDEK LEAVADTVDK HAEAFNDIAD SLDETNTKAD EAVKTANEAK
 601 QTAEETKQNV DAKVKAAETA AGKAEAAAGT ANTAADKAEA VAAKVTDIKA
 651 DIATNKDNIA KKANSADVYT REESDSKFVR IDGLNATTEK LDTRLASAEK
 701 SIADHDTRLN GLDKTVSDLR KETRQGLAEQ AALSGLFQPY NVGLEHHHHH
 751 H*



961-ORF46.1

   1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
  51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
 101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
 151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
 201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
 251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
 301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
 351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
 401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
 451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
 501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
 551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA
 601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
 651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
 701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
 751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
 801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
 851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
 901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
 951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTCGGTTC AATGTAACGG
1001 CTGCAGTCGG CGGCTACAAA TCCGAATCGG CAGTCGCCAT CGGTACCGGC
1051 TTCCGCTTTA CCGAAAACTT TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC
1101 TTCGTCCGGT TCTTCCGCAG CCTACCATGT CGGCGTCAAT TACGAGTGGG
1151 GATCCGGAGG AGGAGGATCA GATTTGGCAA ACGATTCTTT TATCCGGCAG
1201 GTTCTCGACC GTCAGCATTT CGAACCCGAC GGGAAATACC ACCTATTCGG
1251 CAGCAGGGGG GAACTTGCCG AGCGCAGCGG CCATATCGGA TTGGGAAAAA
1301 TACAAAGCCA TCAGTTGGGC AACCTGATGA TTCAACAGGC GGCCATTAAA
1351 GGAAATATCG GCTACATTGT CCGCTTTTCC GATCACGGGC ACGAAGTCCA
1401 TTCCCCCTTC GACAACCATG CCTCACATTC CGATTCTGAT GAAGCCGGTA
1451 GTCCCGTTGA CGGATTTAGC CTTTACCGCA TCCATTGGGA CGGATACGAA
1501 CACCATCCCG CCGACGGCTA TGACGGGCCA CAGGGCGGCG GCTATCCCGC
1551 TCCCAAAGGC GCGAGGGATA TATACAGCTA CGACATAAAA GGCGTTGCCC
1601 AAAATATCCG CCTCAACCTG ACCGACAACC GCAGCACCGG ACAACGGCTT
1651 GCCGACCGTT TCCACAATGC CGGTAGTATG CTGACGCAAG GAGTAGGCGA
1701 CGGATTCAAA CGCGCCACCC GATACAGCCC CGAGCTGGAC AGATCGGGCA
1751 ATGCCGCCGA AGCCTTCAAC GGCACTGCAG ATATCGTTAA AAACATCATC
1801 GGCGCGGCAG GAGAAATTGT CGGCGCAGGC GATGCCGTGC AGGGCATAAG
1851 CGAAGGCTCA AACATTGCTG TCATGCACGG CTTGGGTCTG CTTTCCACCG
1901 AAAACAAGAT GGCGCGCATC AACGATTTGG CAGATATGGC GCAACTCAAA
1951 GACTATGCCG CAGCAGCCAT CCGCGATTGG GCAGTCCAAA ACCCCAATGC
2001 CGCACAAGGC ATAGAAGCCG TCAGCAATAT CTTTATGGCA GCCATCCCCA
2051 TCAAAGGGAT TGGAGCTGTT CGGGGAAAAT ACGGCTTGGG CGGCATCACG
2101 GCACATCCTA TCAAGCGGTC GCAGATGGGC GCGATCGCAT TGCCGAAAGG
2151 GAAATCCGCC GTCAGCGACA ATTTTGCCGA TGCGGCATAC GCCAAATACC
2201 CGTCCCCTTA CCATTCCCGA AATATCCGTT CAAACTTGGA GCAGCGTTAC
2251 GGCAAAGAAA ACATCACCTC CTCAACCGTG CCGCCGTCAA ACGGCAAAAA
2301 TGTCAAACTG GCAGACCAAC GCCACCCGAA GACAGGCGTA CCGTTTGACG
2351 GTAAAGGGTT TCCGAATTTT GAGAAGCACG TGAAATATGA TACGCTCGAG
2401 CACCACCACC ACCACCACTG A

   1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
  51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
 101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
 151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
 201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
 251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
 301 VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK SESAVAIGTG
 351 FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEWGSGGGGS DLANDSFIRQ
 401 VLDRQHFEPD GKYHLFGSRG ELAERSGHIG LGKIQSHQLG NLMIQQAAIK
 451 GNIGYIVRFS DHGHEVHSPF DNHASHSDSD EAGSPVDGFS LYRIHWDGYE
 501 HHPADGYDGP QGGGYPAPKG ARDIYSYDIK GVAQNIRLNL TDNRSTGQRL
 551 ADRFHNAGSM LTQGVGDGFK RATRYSPELD RSGNAAEAFN GTADIVKNII
 601 GAAGEIVGAG DAVQGISEGS NIAVMHGLGL LSTENKMARI NDLADMAQLK
 651 DYAAAAIRDW AVQNPNAAQG IEAVSNIFMA AIPIKGIGAV RGKYGLGGIT
 701 AHPIKRSQMG AIALPKGKSA VSDNFADAAY AKYPSPYHSR NIRSNLEQRY
 751 GKENITSSTV PPSNGKNVKL ADQRHPKTGV PFDGKGFPNF EKHVKYDTLE
 801 HHHHHH*

961-741

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA
601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTCGGTTC AATGTAACGG
1001 CTGCAGTCGG CGGCTACAAA TCCGAATCGG CAGTCGCCAT CGGTACCGGC
1051 TTCCGCTTTA CCGAAAACTT TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC
1101 TTCGTCCGGT TCTTCCGCAG CCTACCATGT CGGCGTCAAT TACGAGTGGG
1151 GATCCGGAGG GGGTGGTGTC GCCGCCGACA TCGGTGCGGG GCTTGCCGAT
1201 GCACTAACCG CACCGCTCGA CCATAAAGAC AAAGGTTTGC AGTCTTTGAC
1251 GCTGGATCAG TCCGTCAGGA AAAACGAGAA ACTGAAGCTG GCGGCACAAG
1301 GTGCGGAAAA AACTTATGGA AACGGTGACA GCCTCAATAC GGGCAAATTG
1351 AAGAACGACA AGGTCAGCCG TTTCGACTTT ATCCGCCAAA TCGAAGTGGA
1401 CGGGCAGCTC ATTACCTTGG AGAGTGGAGA GTTCCAAGTA TACAAACAAA
1451 GCCATTCCGC CTTAACCGCC TTTCAGACCG AGCAAATACA AGATTCGGAG
1501 CATTCCGGGA AGATGGTTGC GAAACGCCAG TTCAGAATCG GCGACATAGC
1551 GGGCGAACAT ACATCTTTTG ACAAGCTTCC CGAAGGCGGC AGGGCGACAT
1601 ATCGCGGGAC GGCGTTCGGT TCAGACGATG CCGGCGGAAA ACTGACCTAC
1651 ACCATAGATT TCGCCGCCAA GCAGGGAAAC GGCAAAATCG AACATTTGAA
1701 ATCGCCAGAA CTCAATGTCG ACCTGGCCGC CGCCGATATC AAGCCGGATG
1751 GAAAACGCCA TGCCGTCATC AGCGGTTCCG TCCTTTACAA CCAAGCCGAG
1801 AAAGGCAGTT ACTCCCTCGG TATCTTTGGC GGAAAAGCCC AGGAAGTTGC
1851 CGGCAGCGCG GAAGTGAAAA CCGTAAACGG CATACGCCAT ATCGGCCTTG
1901 CCGCCAAGCA ACTCGAGCAC CACCACCACC ACCACTGA

1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
301 VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK SESAVAIGTG
351 FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEWGSGGGGV AADIGAGLAD
401 ALTAPLDHKD KGLQSLTLDQ SVRKNEKLKL AAQGAEKTYG NGDSLNTGKL
451 KNDKVSRFDF IRQIEVDGQL ITLESGEFQV YKQSHSALTA FQTEQIQDSE
501 HSGKMVAKRQ FRIGDIAGEH TSFDKLPEGG RATYRGTAFG SDDAGGKLTY
551 TIDFAAKQGN GKIEHLKSPE LNVDLAAADI KPDGKRHAVI SGSVLYNQAE
601 KGSYSLGIFG GKAQEVAGSA EVKTVNGIRH IGLAAKQLEH HHHHH*

961-983

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA
601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTCGGTTC AATGTAACGG
1001 CTGCAGTCGG CGGCTACAAA TCCGAATCGG CAGTCGCCAT CGGTACCGGC
1051 TTCCGCTTTA CCGAAAACTT TGCCGCCAAA GCAGGCGTGG CAGTCGGCAC
1101 TTCGTCCGGT TCTTCCGCAG CCTACCATGT CGGCGTCAAT TACGAGTGGG
1151 GATCCGGCGG AGGCGGCACT TCTGCGCCCG ACTTCAATGC AGGCGGTACC
1201 GGTATCGGCA GCAACAGCAG AGCAACAACA GCGAAATCAG CAGCAGTATC
1251 TTACGCCGGT ATCAAGAACG AAATGTGCAA AGACAGAAGC ATGCTCTGTG
1301 CCGGTCGGGA TGACGTTGCG GTTACAGACA GGGATGCCAA AATCAATGCC
1351 CCCCCCCCGA ATCTGCATAC CGGAGACTTT CCAAACCCAA ATGACGCATA
1401 CAAGAATTTG ATCAACCTCA AACCTGCAAT TGAAGCAGGC TATACAGGAC
1451 GCGGGGTAGA GGTAGGTATC GTCGACACAG GCGAATCCGT CGGCAGCATA
1501 TCCTTTCCCG AACTGTATGG CAGAAAAGAA CACGGCTATA ACGAAAATTA
1551 CAAAAACTAT ACGGCGTATA TGCGGAAGGA AGCGCCTGAA GACGGAGGCG
1601 GTAAAGACAT TGAAGCTTCT TTCGACGATG AGGCCGTTAT AGAGACTGAA
1651 GCAAAGCCGA CGGATATCCG CCACGTAAAA GAAATCGGAC ACATCGATTT
1701 GGTCTCCCAT ATTATTGGCG GGCGTTCCGT GGACGGCAGA CCTGCAGGCG
1751 GTATTGCGCC CGATGCGACG CTACACATAA TGAATACGAA TGATGAAACC
1801 AAGAACGAAA TGATGGTTGC AGCCATCCGC AATGCATGGG TCAAGCTGGG
1851 CGAACGTGGC GTGCGCATCG TCAATAACAG TTTTGGAACA ACATCGAGGG
1901 CAGGCACTGC CGACCTTTTC CAAATAGCCA ATTCGGAGGA GCAGTACCGC
1951 CAAGCGTTGC TCGACTATTC CGGCGGTGAT AAAACAGACG AGGGTATCCG
2001 CCTGATGCAA CAGAGCGATT ACGGCAACCT GTCCTACCAC ATCCGTAATA
2051 AAAACATGCT TTTCATCTTT TCGACAGGCA ATGACGCACA AGCTCAGCCC
2101 AACACATATG CCCTATTGCC ATTTTATGAA AAAGACGCTC AAAAAGGCAT
2151 TATCACAGTC GCAGGCGTAG ACCGCAGTGG AGAAAAGTTC AAACGGGAAA
2201 TGTATGGAGA ACCGGGTACA GAACCGCTTG AGTATGGCTC CAACCATTGC
2251 GGAATTACTG CCATGTGGTG CCTGTCGGCA CCCTATGAAG CAAGCGTCCG
2301 TTTCACCCGT ACAAACCCGA TTCAAATTGC CGGAACATCC TTTTCCGCAC
2351 CCATCGTAAC CGGCACGGCG GCTCTGCTGC TGCAGAAATA CCCGTGGATG
2401 AGCAACGACA ACCTGCGTAC CACGTTGCTG ACGACGGCTC AGGACATCGG
2451 TGCAGTCGGC GTGGACAGCA AGTTCGGCTG GGGACTGCTG GATGCGGGTA
2501 AGGCCATGAA CGGACCCGCG TCCTTTCCGT TCGGCGACTT TACCGCCGAT
2551 ACGAAAGGTA CATCCGATAT TGCCTACTCC TTCCGTAACG ACATTTCAGG
2601 CACGGGCGGC CTGATCAAAA AAGGCGGCAG CCAACTGCAA CTGCACGGCA
2651 ACAACACCTA TACGGGCAAA ACCATTATCG AAGGCGGTTC GCTGGTGTTG
2701 TACGGCAACA ACAAATCGGA TATGCGCGTC GAAACCAAAG GTGCGCTGAT
2751 TTATAACGGG GCGGCATCCG GCGGCAGCCT GAACAGCGAC GGCATTGTCT
2801 ATCTGGCAGA TACCGACCAA TCCGGCGCAA ACGAAACCGT ACACATCAAA
2851 GGCAGTCTGC AGCTGGACGG CAAAGGTACG CTGTACACAC GTTTGGGCAA
2901 ACTGCTGAAA GTGGACGGTA CGGCGATTAT CGGCGGCAAG CTGTACATGT
2951 CGGCACGCGG CAAGGGGGCA GGCTATCTCA ACAGTACCGG ACGACGTGTT
3001 CCCTTCCTGA GTGCCGCCAA AATCGGGCAG GATTATTCTT TCTTCACAAA
3051 CATCGAAACC GACGGCGGCC TGCTGGCTTC CCTCGACAGC GTCGAAAAAA
3101 CAGCGGGCAG TGAAGGCGAC ACGCTGTCCT ATTATGTCCG TCGCGGCAAT
3151 GCGGCACGGA CTGCTTCGGC AGCGGCACAT TCCGCGCCCG CCGGTCTGAA
3201 ACACGCCGTA GAACAGGGCG GCAGCAATCT GGAAAACCTG ATGGTCGAAC
3251 TGGATGCCTC CGAATCATCC GCAACACCCG AGACGGTTGA AACTGCGGCA
3301 GCCGACCGCA CAGATATGCC GGGCATCCGC CCCTACGGCG CAACTTTCCG
3351 CGCAGCGGCA GCCGTACAGC ATGCGAATGC CGCCGACGGT GTACGCATCT
3401 TCAACAGTCT CGCCGCTACC GTCTATGCCG ACAGTACCGC CGCCCATGCC
3451 GATATGCAGG GACGCCGCCT GAAAGCCGTA TCGGACGGGT TGGACCACAA
3501 CGGCACGGGT CTGCGCGTCA TCGCGCAAAC CCAACAGGAC GGTGGAACGT
3551 GGGAACAGGG CGGTGTTGAA GGCAAAATGC GCGGCAGTAC CCAAACCGTC
3601 GGCATTGCCG CGAAAACCGG CGAAAATACG ACAGCAGCCG CCACACTGGG
3651 CATGGGACGC AGCACATGGA GCGAAAACAG TGCAAATGCA AAAACCGACA
3701 GCATTAGTCT GTTTGCAGGC ATACGGCACG ATGCGGGCGA TATCGGCTAT
3751 CTCAAAGGCC TGTTCTCCTA CGGACGCTAC AAAAACAGCA TCAGCCGCAG
3801 CACCGGTGCG GACGAACATG CGGAAGGCAG CGTCAACGGC ACGCTGATGC
3851 AGCTGGGCGC ACTGGGCGGT GTCAACGTTC CGTTTGCCGC AACGGGAGAT
3901 TTGACGGTCG AAGGCGGTCT GCGCTACGAC CTGCTCAAAC AGGATGCATT
3951 CGCCGAAAAA GGCAGTGCTT TGGGCTGGAG CGGCAACAGC CTCACTGAAG
4001 GCACGCTGGT CGGACTCGCG GGTCTGAAGC TGTCGCAACC CTTGAGCGAT
4051 AAAGCCGTCC TGTTTGCAAC GGCGGGCGTG GAACGCGACC TGAACGGACG
4101 CGACTACACG GTAACGGGCG GCTTTACCGG CGCGACTGCA GCAACCGGCA
4151 AGACGGGGGC ACGCAATATG CCGCACACCC GTCTGGTTGC CGGCCTGGGC
4201 GCGGATGTCG AATTCGGCAA CGGCTGGAAC GGCTTGGCAC GTTACAGCTA
4251 CGCCGGTTCC AAACAGTACG GCAACCACAG CGGACGAGTC GGCGTAGGCT
4301 ACCGGTTCCT CGAGCACCAC CACCACCACC ACTGA

1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
301 VSDLRKETRQ GLAEQAALSG LFQPYNVGRF NVTAAVGGYK SESAVAIGTG
351 FRFTENFAAK AGVAVGTSSG SSAAYHVGVN YEWGSGGGGT SAPDFNAGGT
401 GIGSNSRATT AKSAAVSYAG IKNEMCKDRS MLCAGRDDVA VTDRDAKINA
451 PPPNLHTGDF PNPNDAYKNL INLKPAIEAG YTGRGVEVGI VDTGESVGSI
501 SFPELYGRKE HGYNENYKNY TAYMRKEAPE DGGGKDIEAS FDDEAVIETE
551 AKPTDIRHVK EIGHIDLVSH IIGGRSVDGR PAGGIAPDAT LHIMNTNDET
601 KNEMMVAAIR NAWVKLGERG VRIVNNSFGT TSRAGTADLF QIANSEEQYR
651 QALLDYSGGD KTDEGIRLMQ QSDYGNLSYH IRNKNMLFIF STGNDAQAQP
701 NTYALLPFYE KDAQKGIITV AGVDRSGEKF KREMYGEPGT EPLEYGSNHC
751 GITAMWCLSA PYEASVRFTR TNPIQIAGTS FSAPIVTGTA ALLLQKYPWM
801 SNDNLRTTLL TTAQDIGAVG VDSKFGWGLL DAGKAMNGPA SFPFGDFTAD
851 TKGTSDIAYS FRNDISGTGG LIKKGGSQLQ LHGNNTYTGK TIIEGGSLVL
901 YGNNKSDMRV ETKGALIYNG AASGGSLNSD GIVYLADTDQ SGANETVHIK
951 GSLQLDGKGT LYTRLGKLLK VDGTAIIGGK LYMSARGKGA GYLNSTGRRV
1001 PFLSAAKIGQ DYSFFTNIET DGGLLASLDS VEKTAGSEGD TLSYYVRRGN
1051 AARTASAAAH SAPAGLKHAV EQGGSNLENL MVELDASESS ATPETVETAA
1101 ADRTDMPGIR PYGATFRAAA AVQHANAADG VRIFNSLAAT VYADSTAAHA
1151 DMQGRRLKAV SDGLDHNGTG LRVIAQTQQD GGTWEQGGVE GKMRGSTQTV
1201 GIAAKTGENT TAAATLGMGR STWSENSANA KTDSISLFAG IRHDAGDIGY
1251 LKGLFSYGRY KNSISRSTGA DEHAEGSVNG TLMQLGALGG VNVPFAATGD
1301 LTVEGGLRYD LLKQDAFAEK GSALGWSGNS LTEGTLVGLA GLKLSQPLSD
1351 KAVLFATAGV ERDLNGRDYT VTGGFTGATA ATGKTGARNM PHTRLVAGLG
1401 ADVEFGNGWN GLARYSYAGS KQYGNHSGRV GVGYRFLEHH HHHH*

961C-ORF46.1

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA
601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTGGATCC GGAGGAGGAG
1001 GATCAGATTT GGCAAACGAT TCTTTTATCC GGCAGGTTCT CGACCGTCAG
1051 CATTTCGAAC CCGACGGGAA ATACCACCTA TTCGGCAGCA GGGGGGAACT
1101 TGCCGAGCGC AGCGGCCATA TCGGATTGGG AAAAATACAA AGCCATCAGT
1151 TGGGCAACCT GATGATTCAA CAGGCGGCCA TTAAAGGAAA TATCGGCTAC
1201 ATTGTCCGCT TTTCCGATCA CGGGCACGAA GTCCATTCCC CCTTCGACAA
1251 CCATGCCTCA CATTCCGATT CTGATGAAGC CGGTAGTCCC GTTGACGGAT
1301 TTAGCCTTTA CCGCATCCAT TGGGACGGAT ACGAACACCA TCCCGCCGAC
1351 GGCTATGACG GGCCACAGGG CGGCGGCTAT CCCGCTCCCA AAGGCGCGAG
1401 GGATATATAC AGCTACGACA TAAAAGGCGT TGCCCAAAAT ATCCGCCTCA
1451 ACCTGACCGA CAACCGCAGC ACCGGACAAC GGCTTGCCGA CCGTTTCCAC
1501 AATGCCGGTA GTATGCTGAC GCAAGGAGTA GGCGACGGAT TCAAACGCGC
1551 CACCCGATAC AGCCCCGAGC TGGACAGATC GGGCAATGCC GCCGAAGCCT
1601 TCAACGGCAC TGCAGATATC GTTAAAAACA TCATCGGCGC GGCAGGAGAA
1651 ATTGTCGGCG CAGGCGATGC CGTGCAGGGC ATAAGCGAAG GCTCAAACAT
1701 TGCTGTCATG CACGGCTTGG GTCTGCTTTC CACCGAAAAC AAGATGGCGC
1751 GCATCAACGA TTTGGCAGAT ATGGCGCAAC TCAAAGACTA TGCCGCAGCA
1801 GCCATCCGCG ATTGGGCAGT CCAAAACCCC AATGCCGCAC AAGGCATAGA
1851 AGCCGTCAGC AATATCTTTA TGGCAGCCAT CCCCATCAAA GGGATTGGAG
1901 CTGTTCGGGG AAAATACGGC TTGGGCGGCA TCACGGCACA TCCTATCAAG
1951 CGGTCGCAGA TGGGCGCGAT CGCATTGCCG AAAGGGAAAT CCGCCGTCAG
2001 CGACAATTTT GCCGATGCGG CATACGCCAA ATACCCGTCC CCTTACCATT
2051 CCCGAAATAT CCGTTCAAAC TTGGAGCAGC GTTACGGCAA AGAAAACATC
2101 ACCTCCTCAA CCGTGCCGCC GTCAAACGGC AAAAATGTCA AACTGGCAGA
2151 CCAACGCCAC CCGAAGACAG GCGTACCGTT TGACGGTAAA GGGTTTCCGA
2201 ATTTTGAGAA GCACGTGAAA TATGATACGC TCGAGCACCA CCACCACCAC
2251 CACTGA

1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
301 VSDLRKETRQ GLAEQAALSG LFQPYNVGGS GGGGSDLAND SFIRQVLDRQ
351 HFEPDGKYHL FGSRGELAER SGHIGLGKIQ SHQLGNLMIQ QAAIKGNIGY
401 IVRFSDHGHE VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD
451 GYDGPQGGGY PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH
501 NAGSMLTQGV GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE
551 IVGAGDAVQG ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA
601 AIRDWAVQNP NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPIK
651 RSQMGAIALP KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI
701 TSSTVPPSNG KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTLEHHHHH
751 H*

961C-741

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
551 CCAAACAGAC GGCCGAAGAA ACCAAACAAA ACGTCGATGC CAAAGTAAAA
601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTGGATCC GGAGGGGGTG
1001 GTGTCGCCGC CGACATCGGT GCGGGGCTTG CCGATGCACT AACCGCACCG
1051 CTCGACCATA AAGACAAAGG TTTGCAGTCT TTGACGCTGG ATCAGTCCGT
1101 CAGGAAAAAC GAGAAACTGA AGCTGGCGGC ACAAGGTGCG GAAAAAACTT
1151 ATGGAAACGG TGACAGCCTC AATACGGGCA AATTGAAGAA CGACAAGGTC
1201 AGCCGTTTCG ACTTTATCCG CCAAATCGAA GTGGACGGGC AGCTCATTAC
1251 CTTGGAGAGT GGAGAGTTCC AAGTATACAA ACAAAGCCAT TCCGCCTTAA
1301 CCGCCTTTCA GACCGAGCAA ATACAAGATT CGGAGCATTC CGGGAAGATG
1351 GTTGCGAAAC GCCAGTTCAG AATCGGCGAC ATAGCGGGCG AACATACATC
1401 TTTTGACAAG CTTCCCGAAG GCGGCAGGGC GACATATCGC GGGACGGCGT
1451 TCGGTTCAGA CGATGCCGGC GGAAAACTGA CCTACACCAT AGATTTCGCC
1501 GCCAAGCAGG GAAACGGCAA AATCGAACAT TTGAAATCGC CAGAACTCAA
1551 TGTCGACCTG GCCGCCGCCG ATATCAAGCC GGATGGAAAA CGCCATGCCG
1601 TCATCAGCGG TTCCGTCCTT TACAACCAAG CCGAGAAAGG CAGTTACTCC
1651 CTCGGTATCT TTGGCGGAAA AGCCCAGGAA GTTGCCGGCA GCGCGGAAGT
1701 GAAAACCGTA AACGGCATAC GCCATATCGG CCTTGCCGCC AAGCAACTCG
1751 AGCACCACCA CCACCACCAC TGA

1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
301 VSDLRKETRQ GLAEQAALSG LFQPYNVGGS GGGGVAADIG AGLADALTAP
351 LDHKDKGLQS LTLDQSVRKN EKLKLAAQGA EKTYGNGDSL NTGKLKNDKV
401 SRFDFIRQIE VDGQLITLES GEFQVYKQSH SALTAFQTEQ IQDSEHSGKM
451 VAKRQFRIGD IAGEHTSFDK LPEGGRATYR GTAFGSDDAG GKLTYTIDFA
501 AKQGNGKIEH LKSPELNVDL AAADIKPDGK RHAVISGSVL YNQAEKGSYS
551 LGIFGGKAQE VAGSAEVKTV NGIRHIGLAA KQLEHHHHHH

961C-983

1 ATGGCCACAA ACGACGACGA TGTTAAAAAA GCTGCCACTG TGGCCATTGC
51 TGCTGCCTAC AACAATGGCC AAGAAATCAA CGGTTTCAAA GCTGGAGAGA
101 CCATCTACGA CATTGATGAA GACGGCACAA TTACCAAAAA AGACGCAACT
151 GCAGCCGATG TTGAAGCCGA CGACTTTAAA GGTCTGGGTC TGAAAAAAGT
201 CGTGACTAAC CTGACCAAAA CCGTCAATGA AAACAAACAA AACGTCGATG
251 CCAAAGTAAA AGCTGCAGAA TCTGAAATAG AAAAGTTAAC AACCAAGTTA
301 GCAGACACTG ATGCCGCTTT AGCAGATACT GATGCCGCTC TGGATGCAAC
351 CACCAACGCC TTGAATAAAT TGGGAGAAAA TATAACGACA TTTGCTGAAG
401 AGACTAAGAC AAATATCGTA AAAATTGATG AAAAATTAGA AGCCGTGGCT
451 GATACCGTCG ACAAGCATGC CGAAGCATTC AACGATATCG CCGATTCATT
501 GGATGAAACC AACACTAAGG CAGACGAAGC CGTCAAAACC GCCAATGAAG
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551 CCAAACAGAC GGCCGAAGAA AC C AAAC AAA ACGTCGATGC CAAAGTAAAA 

601 GCTGCAGAAA CTGCAGCAGG CAAAGCCGAA GCTGCCGCTG GCACAGCTAA
651 TACTGCAGCC GACAAGGCCG AAGCTGTCGC TGCAAAAGTT ACCGACATCA
701 AAGCTGATAT CGCTACGAAC AAAGATAATA TTGCTAAAAA AGCAAACAGT
751 GCCGACGTGT ACACCAGAGA AGAGTCTGAC AGCAAATTTG TCAGAATTGA
801 TGGTCTGAAC GCTACTACCG AAAAATTGGA CACACGCTTG GCTTCTGCTG
851 AAAAATCCAT TGCCGATCAC GATACTCGCC TGAACGGTTT GGATAAAACA
901 GTGTCAGACC TGCGCAAAGA AACCCGCCAA GGCCTTGCAG AACAAGCCGC
951 GCTCTCCGGT CTGTTCCAAC CTTACAACGT GGGTGGATCC GGCGGAGGCG
1001 GCACTTCTGC GCCCGACTTC AATGCAGGCG GTACCGGTAT CGGCAGCAAC
1051 AGCAGAGCAA CAACAGCGAA ATCAGCAGCA GTATCTTACG CCGGTATCAA
1101 GAACGAAATG TGCAAAGACA GAAGCATGCT CTGTGCCGGT CGGGATGACG
1151 TTGCGGTTAC AGACAGGGAT GCCAAAATCA ATGCCCCCCC CCCGAATCTG
1201 CATACCGGAG ACTTTCCAAA CCCAAATGAC GCATACAAGA ATTTGATCAA
1251 CCTCAAACCT GCAATTGAAG CAGGCTATAC AGGACGCGGG GTAGAGGTAG
1301 GTATCGTCGA CACAGGCGAA TCCGTCGGCA GCATATCCTT TCCCGAACTG
1351 TATGGCAGAA AAGAACACGG CTATAACGAA AATTACAAAA ACTATACGGC
1401 GTATATGCGG AAGGAAGCGC CTGAAGACGG AGGCGGTAAA GACATTGAAG
1451 CTTCTTTCGA CGATGAGGCC GTTATAGAGA CTGAAGCAAA GCCGACGGAT
1501 ATCCGCCACG TAAAAGAAAT CGGACACATC GATTTGGTCT CCCATATTAT
1551 TGGCGGGCGT TCCGTGGACG GCAGACCTGC AGGCGGTATT GCGCCCGATG
1601 CGACGCTACA CATAATGAAT ACGAATGATG AAACCAAGAA CGAAATGATG
1651 GTTGCAGCCA TCCGCAATGC ATGGGTCAAG CTGGGCGAAC GTGGCGTGCG
1701 CATCGTCAAT AACAGTTTTG GAACAACATC GAGGGCAGGC ACTGCCGACC
1751 TTTTCCAAAT AGCCAATTCG GAGGAGCAGT ACCGCCAAGC GTTGCTCGAC
1801 TATTCCGGCG GTGATAAAAC AGACGAGGGT ATCCGCCTGA TGCAACAGAG
1851 CGATTACGGC AACCTGTCCT ACCACATCCG TAATAAAAAC ATGCTTTTCA
1901 TCTTTTCGAC AGGCAATGAC GCACAAGCTC AGCCCAACAC ATATGCCCTA
1951 TTGCCATTTT ATGAAAAAGA CGCTCAAAAA GGCATTATCA CAGTCGCAGG
2001 CGTAGACCGC AGTGGAGAAA AGTTCAAACG GGAAATGTAT GGAGAACCGG
2051 GTACAGAACC GCTTGAGTAT GGCTCCAACC ATTGCGGAAT TACTGCCATG
2101 TGGTGCCTGT CGGCACCCTA TGAAGCAAGC GTCCGTTTCA CCCGTACAAA
2151 CCCGATTCAA ATTGCCGGAA CATCCTTTTC CGCACCCATC GTAACCGGCA
2201 CGGCGGCTCT GCTGCTGCAG AAATACCCGT GGATGAGCAA CGACAACCTG
2251 CGTACCACGT TGCTGACGAC GGCTCAGGAC ATCGGTGCAG TCGGCGTGGA
2301 CAGCAAGTTC GGCTGGGGAC TGCTGGATGC GGGTAAGGCC ATGAACGGAC
2351 CCGCGTCCTT TCCGTTCGGC GACTTTACCG CCGATACGAA AGGTACATCC
2401 GATATTGCCT ACTCCTTCCG TAACGACATT TCAGGCACGG GCGGCCTGAT
2451 CAAAAAAGGC GGCAGCCAAC TGCAACTGCA CGGCAACAAC ACCTATACGG
2501 GCAAAACCAT TATCGAAGGC GGTTCGCTGG TGTTGTACGG CAACAACAAA
2551 TCGGATATGC GCGTCGAAAC CAAAGGTGCG CTGATTTATA ACGGGGCGGC
2601 ATCCGGCGGC AGCCTGAACA GCGACGGCAT TGTCTATCTG GCAGATACCG
2651 ACCAATCCGG CGCAAACGAA ACCGTACACA TCAAAGGCAG TCTGCAGCTG
2701 GACGGCAAAG GTACGCTGTA CACACGTTTG GGCAAACTGC TGAAAGTGGA
2751 CGGTACGGCG ATTATCGGCG GCAAGCTGTA CATGTCGGCA CGCGGCAAGG
2801 GGGCAGGCTA TCTCAACAGT ACCGGACGAC GTGTTCCCTT CCTGAGTGCC
2851 GCCAAAATCG GGCAGGATTA TTCTTTCTTC ACAAACATCG AAACCGACGG
2901 CGGCCTGCTG GCTTCCCTCG ACAGCGTCGA AAAAACAGCG GGCAGTGAAG
2951 GCGACACGCT GTCCTATTAT GTCCGTCGCG GCAATGCGGC ACGGACTGCT
3001 TCGGCAGCGG CACATTCCGC GCCCGCCGGT CTGAAACACG CCGTAGAACA
3051 GGGCGGCAGC AATCTGGAAA ACCTGATGGT CGAACTGGAT GCCTCCGAAT
3101 CATCCGCAAC ACCCGAGACG GTTGAAACTG CGGCAGCCGA CCGCACAGAT
3151 ATGCCGGGCA TCCGCCCCTA CGGCGCAACT TTCCGCGCAG CGGCAGCCGT
3201 ACAGCATGCG AATGCCGCCG ACGGTGTACG CATCTTCAAC AGTCTCGCCG
3251 CTACCGTCTA TGCCGACAGT ACCGCCGCCC ATGCCGATAT GCAGGGACGC
3301 CGCCTGAAAG CCGTATCGGA CGGGTTGGAC CACAACGGCA CGGGTCTGCG
3351 CGTCATCGCG CAAACCCAAC AGGACGGTGG AACGTGGGAA CAGGGCGGTG
3401 TTGAAGGCAA AATGCGCGGC AGTACCCAAA CCGTCGGCAT TGCCGCGAAA
3451 ACCGGCGAAA ATACGACAGC AGCCGCCACA CTGGGCATGG GACGCAGCAC
3501 ATGGAGCGAA AACAGTGCAA ATGCAAAAAC CGACAGCATT AGTCTGTTTG
3551 CAGGCATACG GCACGATGCG GGCGATATCG GCTATCTCAA AGGCCTGTTC
3601 TCCTACGGAC GCTACAAAAA CAGCATCAGC CGCAGCACCG GTGCGGACGA
3651 ACATGCGGAA GGCAGCGTCA ACGGCACGCT GATGCAGCTG GGCGCACTGG
3701 GCGGTGTCAA CGTTCCGTTT GCCGCAACGG GAGATTTGAC GGTCGAAGGC
3751 GGTCTGCGCT ACGACCTGCT CAAACAGGAT GCATTCGCCG AAAAAGGCAG
3801 TGCTTTGGGC TGGAGCGGCA ACAGCCTCAC TGAAGGCACG CTGGTCGGAC
3851 TCGCGGGTCT GAAGCTGTCG CAACCCTTGA GCGATAAAGC CGTCCTGTTT
3901 GCAACGGCGG GCGTGGAACG CGACCTGAAC GGACGCGACT ACACGGTAAC
3951 GGGCGGCTTT ACCGGCGCGA CTGCAGCAAC CGGCAAGACG GGGGCACGCA
4001 ATATGCCGCA CACCCGTCTG GTTGCCGGCC TGGGCGCGGA TGTCGAATTC
4051 GGCAACGGCT GGAACGGCTT GGCACGTTAC AGCTACGCCG GTTCCAAACA
4101 GTACGGCAAC CACAGCGGAC GAGTCGGCGT AGGCTACCGG TTCCTCGAGC
4151 ACCACCACCA CCACCACTGA

1 MATNDDDVKK AATVAIAAAY NNGQEINGFK AGETIYDIDE DGTITKKDAT
51 AADVEADDFK GLGLKKVVTN LTKTVNENKQ NVDAKVKAAE SEIEKLTTKL
101 ADTDAALADT DAALDATTNA LNKLGENITT FAEETKTNIV KIDEKLEAVA
151 DTVDKHAEAF NDIADSLDET NTKADEAVKT ANEAKQTAEE TKQNVDAKVK
201 AAETAAGKAE AAAGTANTAA DKAEAVAAKV TDIKADIATN KDNIAKKANS
251 ADVYTREESD SKFVRIDGLN ATTEKLDTRL ASAEKSIADH DTRLNGLDKT
301 VSDLRKETRQ GLAEQAALSG LFQPYNVGGS GGGGTSAPDF NAGGTGIGSN
351 SRATTAKSAA VSYAGIKNEM CKDRSMLCAG RDDVAVTDRD AKINAPPPNL
401 HTGDFPNPND AYKNLINLKP AIEAGYTGRG VEVGIVDTGE SVGSISFPEL
451 YGRKEHGYNE NYKNYTAYMR KEAPEDGGGK DIEASFDDEA VIETEAKPTD
501 IRHVKEIGHI DLVSHIIGGR SVDGRPAGGI APDATLHIMN TNDETKNEMM
551 VAAIRNAWVK LGERGVRIVN NSFGTTSRAG TADLFQIANS EEQYRQALLD
601 YSGGDKTDEG IRLMQQSDYG NLSYHIRNKN MLFIFSTGND AQAQPNTYAL
651 LPFYEKDAQK GIITVAGVDR SGEKFKREMY GEPGTEPLEY GSNHCGITAM
701 WCLSAPYEAS VRFTRTNPIQ IAGTSFSAPI VTGTAALLLQ KYPWMSNDNL
751 RTTLLTTAQD IGAVGVDSKF GWGLLDAGKA MNGPASFPFG DFTADTKGTS
801 DIAYSFRNDI SGTGGLIKKG GSQLQLHGNN TYTGKTIIEG GSLVLYGNNK
851 SDMRVETKGA LIYNGAASGG SLNSDGIVYL ADTDQSGANE TVHIKGSLQL
901 DGKGTLYTRL GKLLKVDGTA IIGGKLYMSA RGKGAGYLNS TGRRVPFLSA
951 AKIGQDYSFF TNIETDGGLL ASLDSVEKTA GSEGDTLSYY VRRGNAARTA
1001 SAAAHSAPAG LKHAVEQGGS NLENLMVELD ASESSATPET VETAAADRTD
1051 MPGIRPYGAT FRAAAAVQHA NAADGVRIFN SLAATVYADS TAAHADMQGR
1101 RLKAVSDGLD HNGTGLRVIA QTQQDGGTWE QGGVEGKMRG STQTVGIAAK
1151 TGENTTAAAT LGMGRSTWSE NSANAKTDSI SLFAGIRHDA GDIGYLKGLF
1201 SYGRYKNSIS RSTGADEHAE GSVNGTLMQL GALGGVNVPF AATGDLTVEG
1251 GLRYDLLKQD AFAEKGSALG WSGNSLTEGT LVGLAGLKLS QPLSDKAVLF
1301 ATAGVERDLN GRDYTVTGGF TGATAATGKT GARNMPHTRL VAGLGADVEF
1351 GNGWNGLARY SYAGSKQYGN HSGRVGVGYR FLEHHHHHH*

961cL-ORF46.1

1 ATGAAACACT TTCCATCCAA AGTACTGACC ACAGCCATCC TTGCCACTTT
51 CTGTAGCGGC GCACTGGCAG CCACAAACGA CGACGATGTT AAAAAAGCTG
101 CCACTGTGGC CATTGCTGCT GCCTACAACA ATGGCCAAGA AATCAACGGT
151 TTCAAAGCTG GAGAGACCAT CTACGACATT GATGAAGACG GCACAATTAC
201 CAAAAAAGAC GCAACTGCAG CCGATGTTGA AGCCGACGAC TTTAAAGGTC
251 TGGGTCTGAA AAAAGTCGTG ACTAACCTGA CCAAAACCGT CAATGAAAAC
301 AAACAAAACG TCGATGCCAA AGTAAAAGCT GCAGAATCTG AAATAGAAAA
351 GTTAACAACC AAGTTAGCAG ACACTGATGC CGCTTTAGCA GATACTGATG
401 CCGCTCTGGA TGCAACCACC AACGCCTTGA ATAAATTGGG AGAAAATATA
451 ACGACATTTG CTGAAGAGAC TAAGACAAAT ATCGTAAAAA TTGATGAAAA
501 ATTAGAAGCC GTGGCTGATA CCGTCGACAA GCATGCCGAA GCATTCAACG
551 ATATCGCCGA TTCATTGGAT GAAACCAACA CTAAGGCAGA CGAAGCCGTC
601 AAAACCGCCA ATGAAGCCAA ACAGACGGCC GAAGAAACCA AACAAAACGT
651 CGATGCCAAA GTAAAAGCTG CAGAAACTGC AGCAGGCAAA GCCGAAGCTG
701 CCGCTGGCAC AGCTAATACT GCAGCCGACA AGGCCGAAGC TGTCGCTGCA
751 AAAGTTACCG ACATCAAAGC TGATATCGCT ACGAACAAAG ATAATATTGC
801 TAAAAAAGCA AACAGTGCCG ACGTGTACAC CAGAGAAGAG TCTGACAGCA
851 AATTTGTCAG AATTGATGGT CTGAACGCTA CTACCGAAAA ATTGGACACA
901 CGCTTGGCTT CTGCTGAAAA ATCCATTGCC GATCACGATA CTCGCCTGAA
951 CGGTTTGGAT AAAACAGTGT CAGACCTGCG CAAAGAAACC CGCCAAGGCC
1001 TTGCAGAACA AGCCGCGCTC TCCGGTCTGT TCCAACCTTA CAACGTGGGT
1051 GGATCCGGAG GAGGAGGATC AGATTTGGCA AACGATTCTT TTATCCGGCA
1101 GGTTCTCGAC CGTCAGCATT TCGAACCCGA CGGGAAATAC CACCTATTCG
1151 GCAGCAGGGG GGAACTTGCC GAGCGCAGCG GCCATATCGG ATTGGGAAAA
1201 ATACAAAGCC ATCAGTTGGG CAACCTGATG ATTCAACAGG CGGCCATTAA
1251 AGGAAATATC GGCTACATTG TCCGCTTTTC CGATCACGGG CACGAAGTCC
1301 ATTCCCCCTT CGACAACCAT GCCTCACATT CCGATTCTGA TGAAGCCGGT
1351 AGTCCCGTTG ACGGATTTAG CCTTTACCGC ATCCATTGGG ACGGATACGA
1401 ACACCATCCC GCCGACGGCT ATGACGGGCC ACAGGGCGGC GGCTATCCCG
1451 CTCCCAAAGG CGCGAGGGAT ATATACAGCT ACGACATAAA AGGCGTTGCC
1501 CAAAATATCC GCCTCAACCT GACCGACAAC CGCAGCACCG GACAACGGCT
1551 TGCCGACCGT TTCCACAATG CCGGTAGTAT GCTGACGCAA GGAGTAGGCG
1601 ACGGATTCAA ACGCGCCACC CGATACAGCC CCGAGCTGGA CAGATCGGGC
1651 AATGCCGCCG AAGCCTTCAA CGGCACTGCA GATATCGTTA AAAACATCAT
1701 CGGCGCGGCA GGAGAAATTG TCGGCGCAGG CGATGCCGTG CAGGGCATAA
1751 GCGAAGGCTC AAACATTGCT GTCATGCACG GCTTGGGTCT GCTTTCCACC
1801 GAAAACAAGA TGGCGCGCAT CAACGATTTG GCAGATATGG CGCAACTCAA
1851 AGACTATGCC GCAGCAGCCA TCCGCGATTG GGCAGTCCAA AACCCCAATG
1901 CCGCACAAGG CATAGAAGCC GTCAGCAATA TCTTTATGGC AGCCATCCCC
1951 ATCAAAGGGA TTGGAGCTGT TCGGGGAAAA TACGGCTTGG GCGGCATCAC
2001 GGCACATCCT ATCAAGCGGT CGCAGATGGG CGCGATCGCA TTGCCGAAAG
2051 GGAAATCCGC CGTCAGCGAC AATTTTGCCG ATGCGGCATA CGCCAAATAC
2101 CCGTCCCCTT ACCATTCCCG AAATATCCGT TCAAACTTGG AGCAGCGTTA
2151 CGGCAAAGAA AACATCACCT CCTCAACCGT GCCGCCGTCA AACGGCAAAA
2201 ATGTCAAACT GGCAGACCAA CGCCACCCGA AGACAGGCGT ACCGTTTGAC
2251 GGTAAAGGGT TTCCGAATTT TGAGAAGCAC GTGAAATATG ATACGTAACT
2301 CGAG

1 MKHFPSKVLT TAILATFCSG ALAATNDDDV KKAATVAIAA AYNNGQEING
51 FKAGETIYDI DEDGTITKKD ATAADVEADD FKGLGLKKVV TNLTKTVNEN
101 KQNVDAKVKA AESEIEKLTT KLADTDAALA DTDAALDATT NALNKLGENI
151 TTFAEETKTN IVKIDEKLEA VADTVDKHAE AFNDIADSLD ETNTKADEAV
201 KTANEAKQTA EETKQNVDAK VKAAETAAGK AEAAAGTANT AADKAEAVAA
251 KVTDIKADIA TNKDNIAKKA NSADVYTREE SDSKFVRIDG LNATTEKLDT
301 RLASAEKSIA DHDTRLNGLD KTVSDLRKET RQGLAEQAAL SGLFQPYNVG
351 GSGGGGSDLA NDSFIRQVLD RQHFEPDGKY HLFGSRGELA ERSGHIGLGK
401 IQSHQLGNLM IQQAAIKGNI GYIVRFSDHG HEVHSPFDNH ASHSDSDEAG
451 SPVDGFSLYR IHWDGYEHHP ADGYDGPQGG GYPAPKGARD IYSYDIKGVA
501 QNIRLNLTDN RSTGQRLADR FHNAGSMLTQ GVGDGFKRAT RYSPELDRSG
551 NAAEAFNGTA DIVKNIIGAA GEIVGAGDAV QGISEGSNIA VMHGLGLLST
601 ENKMARINDL ADMAQLKDYA AAAIRDWAVQ NPNAAQGIEA VSNIFMAAIP
651 IKGIGAVRGK YGLGGITAHP IKRSQMGAIA LPKGKSAVSD NFADAAYAKY
701 PSPYHSRNIR SNLEQRYGKE NITSSTVPPS NGKNVKLADQ RHPKTGVPFD
751 GKGFPNFEKH VKYDT*


961cL-741

1 ATGAAACACT TTCCATCCAA AGTACTGACC ACAGCCATCC TTGCCACTTT
51 CTGTAGCGGC GCACTGGCAG CCACAAACGA CGACGATGTT AAAAAAGCTG
101 CCACTGTGGC CATTGCTGCT GCCTACAACA ATGGCCAAGA AATCAACGGT
151 TTCAAAGCTG GAGAGACCAT CTACGACATT GATGAAGACG GCACAATTAC
201 CAAAAAAGAC GCAACTGCAG CCGATGTTGA AGCCGACGAC TTTAAAGGTC
251 TGGGTCTGAA AAAAGTCGTG ACTAACCTGA CCAAAACCGT CAATGAAAAC
301 AAACAAAACG TCGATGCCAA AGTAAAAGCT GCAGAATCTG AAATAGAAAA
351 GTTAACAACC AAGTTAGCAG ACACTGATGC CGCTTTAGCA GATACTGATG
401 CCGCTCTGGA TGCAACCACC AACGCCTTGA ATAAATTGGG AGAAAATATA
451 ACGACATTTG CTGAAGAGAC TAAGACAAAT ATCGTAAAAA TTGATGAAAA
501 ATTAGAAGCC GTGGCTGATA CCGTCGACAA GCATGCCGAA GCATTCAACG
551 ATATCGCCGA TTCATTGGAT GAAACCAACA CTAAGGCAGA CGAAGCCGTC
601 AAAACCGCCA ATGAAGCCAA ACAGACGGCC GAAGAAACCA AACAAAACGT
651 CGATGCCAAA GTAAAAGCTG CAGAAACTGC AGCAGGCAAA GCCGAAGCTG
701 CCGCTGGCAC AGCTAATACT GCAGCCGACA AGGCCGAAGC TGTCGCTGCA
751 AAAGTTACCG ACATCAAAGC TGATATCGCT ACGAACAAAG ATAATATTGC
801 TAAAAAAGCA AACAGTGCCG ACGTGTACAC CAGAGAAGAG TCTGACAGCA
851 AATTTGTCAG AATTGATGGT CTGAACGCTA CTACCGAAAA ATTGGACACA
901 CGCTTGGCTT CTGCTGAAAA ATCCATTGCC GATCACGATA CTCGCCTGAA
951 CGGTTTGGAT AAAACAGTGT CAGACCTGCG CAAAGAAACC CGCCAAGGCC
1001 TTGCAGAACA AGCCGCGCTC TCCGGTCTGT TCCAACCTTA CAACGTGGGT
1051 GGATCCGGAG GGGGTGGTGT CGCCGCCGAC ATCGGTGCGG GGCTTGCCGA
1101 TGCACTAACC GCACCGCTCG ACCATAAAGA CAAAGGTTTG CAGTCTTTGA
1151 CGCTGGATCA GTCCGTCAGG AAAAACGAGA AACTGAAGCT GGCGGCACAA
1201 GGTGCGGAAA AAACTTATGG AAACGGTGAC AGCCTCAATA CGGGCAAATT
1251 GAAGAACGAC AAGGTCAGCC GTTTCGACTT TATCCGCCAA ATCGAAGTGG
1301 ACGGGCAGCT CATTACCTTG GAGAGTGGAG AGTTCCAAGT ATACAAACAA
1351 AGCCATTCCG CCTTAACCGC CTTTCAGACC GAGCAAATAC AAGATTCGGA
1401 GCATTCCGGG AAGATGGTTG CGAAACGCCA GTTCAGAATC GGCGACATAG
1451 CGGGCGAACA TACATCTTTT GACAAGCTTC CCGAAGGCGG CAGGGCGACA
1501 TATCGCGGGA CGGCGTTCGG TTCAGACGAT GCCGGCGGAA AACTGACCTA
1551 CACCATAGAT TTCGCCGCCA AGCAGGGAAA CGGCAAAATC GAACATTTGA
1601 AATCGCCAGA ACTCAATGTC GACCTGGCCG CCGCCGATAT CAAGCCGGAT
1651 GGAAAACGCC ATGCCGTCAT CAGCGGTTCC GTCCTTTACA ACCAAGCCGA
1701 GAAAGGCAGT TACTCCCTCG GTATCTTTGG CGGAAAAGCC CAGGAAGTTG
1751 CCGGCAGCGC GGAAGTGAAA ACCGTAAACG GCATACGCCA TATCGGCCTT
1801 GCCGCCAAGC AACTCGAGCA CCACCACCAC CACCACTGA

1 MKHFPSKVLT TAILATFCSG ALAATNDDDV KKAATVAIAA AYNNGQEING
51 FKAGETIYDI DEDGTITKKD ATAADVEADD FKGLGLKKVV TNLTKTVNEN
101 KQNVDAKVKA AESEIEKLTT KLADTDAALA DTDAALDATT NALNKLGENI
151 TTFAEETKTN IVKIDEKLEA VADTVDKHAE AFNDIADSLD ETNTKADEAV
201 KTANEAKQTA EETKQNVDAK VKAAETAAGK AEAAAGTANT AADKAEAVAA
251 KVTDIKADIA TNKDNIAKKA NSADVYTREE SDSKFVRIDG LNATTEKLDT
301 RLASAEKSIA DHDTRLNGLD KTVSDLRKET RQGLAEQAAL SGLFQPYNVG
351 GSGGGGVAAD IGAGLADALT APLDHKDKGL QSLTLDQSVR KNEKLKLAAQ
401 GAEKTYGNGD SLNTGKLKND KVSRFDFIRQ IEVDGQLITL ESGEFQVYKQ
451 SHSALTAFQT EQIQDSEHSG KMVAKRQFRI GDIAGEHTSF DKLPEGGRAT
501 YRGTAFGSDD AGGKLTYTID FAAKQGNGKI EHLKSPELNV DLAAADIKPD
551 GKRHAVISGS VLYNQAEKGS YSLGIFGGKA QEVAGSAEVK TVNGIRHIGL
601 AAKQLEHHHH HH*



25 



30 



35 



40 



45 



50 



55 



60 



65 



961cL-983

1 ATGAAACACT TTCCATCCAA AGTACTGACC ACAGCCATCC TTGCCACTTT
51 CTGTAGCGGC GCACTGGCAG CCACAAACGA CGACGATGTT AAAAAAGCTG
101 CCACTGTGGC CATTGCTGCT GCCTACAACA ATGGCCAAGA AATCAACGGT
151 TTCAAAGCTG GAGAGACCAT CTACGACATT GATGAAGACG GCACAATTAC
201 CAAAAAAGAC GCAACTGCAG CCGATGTTGA AGCCGACGAC TTTAAAGGTC
251 TGGGTCTGAA AAAAGTCGTG ACTAACCTGA CCAAAACCGT CAATGAAAAC
301 AAACAAAACG TCGATGCCAA AGTAAAAGCT GCAGAATCTG AAATAGAAAA
351 GTTAACAACC AAGTTAGCAG ACACTGATGC CGCTTTAGCA GATACTGATG
401 CCGCTCTGGA TGCAACCACC AACGCCTTGA ATAAATTGGG AGAAAATATA
451 ACGACATTTG CTGAAGAGAC TAAGACAAAT ATCGTAAAAA TTGATGAAAA
501 ATTAGAAGCC GTGGCTGATA CCGTCGACAA GCATGCCGAA GCATTCAACG
551 ATATCGCCGA TTCATTGGAT GAAACCAACA CTAAGGCAGA CGAAGCCGTC
601 AAAACCGCCA ATGAAGCCAA ACAGACGGCC GAAGAAACCA AACAAAACGT
651 CGATGCCAAA GTAAAAGCTG CAGAAACTGC AGCAGGCAAA GCCGAAGCTG
701 CCGCTGGCAC AGCTAATACT GCAGCCGACA AGGCCGAAGC TGTCGCTGCA
751 AAAGTTACCG ACATCAAAGC TGATATCGCT ACGAACAAAG ATAATATTGC
801 TAAAAAAGCA AACAGTGCCG ACGTGTACAC CAGAGAAGAG TCTGACAGCA
851 AATTTGTCAG AATTGATGGT CTGAACGCTA CTACCGAAAA ATTGGACACA
901 CGCTTGGCTT CTGCTGAAAA ATCCATTGCC GATCACGATA CTCGCCTGAA
951 CGGTTTGGAT AAAACAGTGT CAGACCTGCG CAAAGAAACC CGCCAAGGCC
1001 TTGCAGAACA AGCCGCGCTC TCCGGTCTGT TCCAACCTTA CAACGTGGGT
1051 GGATCCGGCG GAGGCGGCAC TTCTGCGCCC GACTTCAATG CAGGCGGTAC
1101 CGGTATCGGC AGCAACAGCA GAGCAACAAC AGCGAAATCA GCAGCAGTAT
1151 CTTACGCCGG TATCAAGAAC GAAATGTGCA AAGACAGAAG CATGCTCTGT
1201 GCCGGTCGGG ATGACGTTGC GGTTACAGAC AGGGATGCCA AAATCAATGC
1251 CCCCCCCCCG AATCTGCATA CCGGAGACTT TCCAAACCCA AATGACGCAT
1301 ACAAGAATTT GATCAACCTC AAACCTGCAA TTGAAGCAGG CTATACAGGA
1351 CGCGGGGTAG AGGTAGGTAT CGTCGACACA GGCGAATCCG TCGGCAGCAT
1401 ATCCTTTCCC GAACTGTATG GCAGAAAAGA ACACGGCTAT AACGAAAATT
1451 ACAAAAACTA TACGGCGTAT ATGCGGAAGG AAGCGCCTGA AGACGGAGGC
1501 GGTAAAGACA TTGAAGCTTC TTTCGACGAT GAGGCCGTTA TAGAGACTGA
1551 AGCAAAGCCG ACGGATATCC GCCACGTAAA AGAAATCGGA CACATCGATT
1601 TGGTCTCCCA TATTATTGGC GGGCGTTCCG TGGACGGCAG ACCTGCAGGC
1651 GGTATTGCGC CCGATGCGAC GCTACACATA ATGAATACGA ATGATGAAAC
1701 CAAGAACGAA ATGATGGTTG CAGCCATCCG CAATGCATGG GTCAAGCTGG
1751 GCGAACGTGG CGTGCGCATC GTCAATAACA GTTTTGGAAC AACATCGAGG
1801 GCAGGCACTG CCGACCTTTT CCAAATAGCC AATTCGGAGG AGCAGTACCG
1851 CCAAGCGTTG CTCGACTATT CCGGCGGTGA TAAAACAGAC GAGGGTATCC
1901 GCCTGATGCA ACAGAGCGAT TACGGCAACC TGTCCTACCA CATCCGTAAT
1951 AAAAACATGC TTTTCATCTT TTCGACAGGC AATGACGCAC AAGCTCAGCC
2001 CAACACATAT GCCCTATTGC CATTTTATGA AAAAGACGCT CAAAAAGGCA
2051 TTATCACAGT CGCAGGCGTA GACCGCAGTG GAGAAAAGTT CAAACGGGAA
2101 ATGTATGGAG AACCGGGTAC AGAACCGCTT GAGTATGGCT CCAACCATTG
2151 CGGAATTACT GCCATGTGGT GCCTGTCGGC ACCCTATGAA GCAAGCGTCC
2201 GTTTCACCCG TACAAACCCG ATTCAAATTG CCGGAACATC CTTTTCCGCA
2251 CCCATCGTAA CCGGCACGGC GGCTCTGCTG CTGCAGAAAT ACCCGTGGAT
2301 GAGCAACGAC AACCTGCGTA CCACGTTGCT GACGACGGCT CAGGACATCG
2351 GTGCAGTCGG CGTGGACAGC AAGTTCGGCT GGGGACTGCT GGATGCGGGT
2401 AAGGCCATGA ACGGACCCGC GTCCTTTCCG TTCGGCGACT TTACCGCCGA
2451 TACGAAAGGT ACATCCGATA TTGCCTACTC CTTCCGTAAC GAGATTTCAG
2501 GCACGGGCGG CCTGATCAAA AAAGGCGGCA GCCAACTGCA ACTGCACGGC
2551 AACAACACCT ATACGGGCAA AACCATTATC GAAGGCGGTT CGCTGGTGTT
2601 GTACGGCAAC AACAAATCGG ATATGCGCGT CGAAACCAAA GGTGCGCTGA
2651 TTTATAACGG GGCGGCATCC GGCGGCAGCC TGAACAGCGA CGGCATTGTC
2701 TATCTGGCAG ATACCGACCA ATCCGGCGCA AACGAAACCG TACACATCAA
2751 AGGCAGTCTG CAGCTGGACG GCAAAGGTAC GCTGTACACA CGTTTGGGCA
2801 AACTGCTGAA AGTGGACGGT ACGGCGATTA TCGGCGGCAA GCTGTACATG
2851 TCGGCACGCG GCAAGGGGGC AGGCTATCTC AACAGTACCG GACGACGTGT
2901 TCCCTTCCTG AGTGCCGCCA AAATCGGGCA GGATTATTCT TTCTTCACAA
2951 ACATCGAAAC CGACGGCGGC CTGCTGGCTT CCCTCGACAG CGTCGAAAAA
3001 ACAGCGGGCA GTGAAGGCGA CACGCTGTCC TATTATGTCC GTCGCGGCAA
3051 TGCGGCACGG ACTGCTTCGG CAGCGGCACA TTCCGCGCCC GCCGGTCTGA
3101 AACACGCCGT AGAACAGGGC GGCAGCAATC TGGAAAACCT GATGGTCGAA
3151 CTGGATGCCT CCGAATCATC CGCAACACCC GAGACGGTTG AAACTGCGGC
3201 AGCCGACCGC ACAGATATGC CGGGCATCCG CCCCTACGGC GCAACTTTCC
3251 GCGCAGCGGC AGCCGTACAG CATGCGAATG CCGCCGACGG TGTACGCATC
3301 TTCAACAGTC TCGCCGCTAC CGTCTATGCC GACAGTACCG CCGCCCATGC
3351 CGATATGCAG GGACGCCGCC TGAAAGCCGT ATCGGACGGG TTGGACCACA
3401 ACGGCACGGG TCTGCGCGTC ATCGCGCAAA CCCAACAGGA CGGTGGAACG
3451 TGGGAACAGG GCGGTGTTGA AGGCAAAATG CGCGGCAGTA CCCAAACCGT
3501 CGGCATTGCC GCGAAAACCG GCGAAAATAC GACAGCAGCC GCCACACTGG
3551 GCATGGGACG CAGCACATGG AGCGAAAACA GTGCAAATGC AAAAACCGAC
3601 AGCATTAGTC TGTTTGCAGG CATACGGCAC GATGCGGGCG ATATCGGCTA
3651 TCTCAAAGGC CTGTTCTCCT ACGGACGCTA CAAAAACAGC ATCAGCCGCA
3701 GCACCGGTGC GGACGAACAT GCGGAAGGCA GCGTCAACGG CACGCTGATG
3751 CAGCTGGGCG CACTGGGCGG TGTCAACGTT CCGTTTGCCG CAACGGGAGA
3801 TTTGACGGTC GAAGGCGGTC TGCGCTACGA CCTGCTCAAA CAGGATGCAT
3851 TCGCCGAAAA AGGCAGTGCT TTGGGCTGGA GCGGCAACAG CCTCACTGAA
3901 GGCACGCTGG TCGGACTCGC GGGTCTGAAG CTGTCGCAAC CCTTGAGCGA
3951 TAAAGCCGTC CTGTTTGCAA CGGCGGGCGT GGAACGCGAC CTGAACGGAC
4001 GCGACTACAC GGTAACGGGC GGCTTTACCG GCGCGACTGC AGCAACCGGC
4051 AAGACGGGGG CACGCAATAT GCCGCACACC CGTCTGGTTG CCGGCCTGGG
4101 CGCGGATGTC GAATTCGGCA ACGGCTGGAA CGGCTTGGCA CGTTACAGCT
4151 ACGCCGGTTC CAAACAGTAC GGCAACCACA GCGGACGAGT CGGCGTAGGC
4201 TACCGGTTCT GACTCGAG

1 MKHFPSKVLT TAILATFCSG ALAATNDDDV KKAATVAIAA AYNNGQEING
51 FKAGETIYDI DEDGTITKKD ATAADVEADD FKGLGLKKVV TNLTKTVNEN
101 KQNVDAKVKA AESEIEKLTT KLADTDAALA DTDAALDATT NALNKLGENI
151 TTFAEETKTN IVKIDEKLEA VADTVDKHAE AFNDIADSLD ETNTKADEAV
201 KTANEAKQTA EETKQNVDAK VKAAETAAGK AEAAAGTANT AADKAEAVAA
251 KVTDIKADIA TNKDNIAKKA NSADVYTREE SDSKFVRIDG LNATTEKLDT
301 RLASAEKSIA DHDTRLNGLD KTVSDLRKET RQGLAEQAAL SGLFQPYNVG
351 GSGGGGTSAP DFNAGGTGIG SNSRATTAKS AAVSYAGIKN EMCKDRSMLC
401 AGRDDVAVTD RDAKINAPPP NLHTGDFPNP NDAYKNLINL KPAIEAGYTG
451 RGVEVGIVDT GESVGSISFP ELYGRKEHGY NENYKNYTAY MRKEAPEDGG
501 GKDIEASFDD EAVIETEAKP TDIRHVKEIG HIDLVSHIIG GRSVDGRPAG
551 GIAPDATLHI MNTNDETKNE MMVAAIRNAW VKLGERGVRI VNNSFGTTSR
601 AGTADLFQIA NSEEQYRQAL LDYSGGDKTD EGIRLMQQSD YGNLSYHIRN
651 KNMLFIFSTG NDAQAQPNTY ALLPFYEKDA QKGIITVAGV DRSGEKFKRE
701 MYGEPGTEPL EYGSNHCGIT AMWCLSAPYE ASVRFTRTNP IQIAGTSFSA
751 PIVTGTAALL LQKYPWMSND NLRTTLLTTA QDIGAVGVDS KFGWGLLDAG
801 KAMNGPASFP FGDFTADTKG TSDIAYSFRN DISGTGGLIK KGGSQLQLHG
851 NNTYTGKTII EGGSLVLYGN NKSDMRVETK GALIYNGAAS GGSLNSDGIV
901 YLADTDQSGA NETVHIKGSL QLDGKGTLYT RLGKLLKVDG TAIIGGKLYM
951 SARGKGAGYL NSTGRRVPFL SAAKIGQDYS FFTNIETDGG LLASLDSVEK
1001 TAGSEGDTLS YYVRRGNAAR TASAAAHSAP AGLKHAVEQG GSNLENLMVE
1051 LDASESSATP ETVETAAADR TDMPGIRPYG ATFRAAAAVQ HANAADGVRI
1101 FNSLAATVYA DSTAAHADMQ GRRLKAVSDG LDHNGTGLRV IAQTQQDGGT
1151 WEQGGVEGKM RGSTQTVGIA AKTGENTTAA ATLGMGRSTW SENSANAKTD
1201 SISLFAGIRH DAGDIGYLKG LFSYGRYKNS ISRSTGADEH AEGSVNGTLM
1251 QLGALGGVNV PFAATGDLTV EGGLRYDLLK QDAFAEKGSA LGWSGNSLTE
1301 GTLVGLAGLK LSQPLSDKAV LFATAGVERD LNGRDYTVTG GFTGATAATG
1351 KTGARNMPHT RLVAGLGADV EFGNGWNGLA RYSYAGSKQY GNHSGRVGVG
1401 YRF*



It will be understood that the invention has been described by way of example only and
modifications may be made whilst remaining within the scope and spirit of the invention. For
instance, the use of proteins from other strains is envisaged [e.g. see WO00/66741 for
polymorphic sequences for ORF4, ORF40, ORF46, 225, 235, 287, 519, 726, 919 and 953].



EXPERIMENTAL DETAILS 

FPLC protein purification 
The following table summarises the FPLC protein purification that was used:

Protein          pI              Column     Buffer             pH    Protocol
121.1 untagged   6.23            Mono Q     Tris               8.0   A
128.1 untagged   5.04            Mono Q     Bis-Tris propane   6.5   A
406.1L           7.75            Mono Q     Diethanolamine     9.0   B
576.1L           5.63            Mono Q     Tris               7.5   B
593 untagged     8.79            Mono S     Hepes              7.4   A
726 untagged     4.95            Hi-trap S  Bis-Tris           6.0   A
919 untagged     10.5(-leader)   Mono S     Bicine             8.5   C
919Lorf4         10.4(-leader)   Mono S     Tris               8.0   B
920L             6.92(-leader)   Mono Q     Diethanolamine     8.5   A
953L             7.56(-leader)   Mono S     MES                6.6   D
982 untagged     4.73            Mono Q     Bis-Tris propane   6.5   A
919-287          6.58            Hi-trap Q  Tris               8.0   A
953-287          4.92            Mono Q     Bis-Tris propane   6.2   A

Buffer solutions included 20-120 mM NaCl, 5.0 mg/ml CHAPS and 10% v/v glycerol. The
dialysate was centrifuged at 13000g for 20 min and applied to either a mono Q or mono S
FPLC ion-exchange resin. Buffer and ion exchange resins were chosen according to the pI of
the protein of interest and the recommendations of the FPLC protocol manual [Pharmacia:
FPLC Ion Exchange and Chromatofocussing; Principles and Methods, Pharmacia
Publication]. Proteins were eluted using a step-wise NaCl gradient. Purification was
analysed by SDS-PAGE and protein concentration determined by the Bradford method.
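
The choice between an anion exchanger and a cation exchanger can be illustrated with the usual rule of thumb: above its pI a protein carries a net negative charge and binds an anion exchanger (Q type), and below its pI it carries a net positive charge and binds a cation exchanger (S type). The sketch below is only an illustration of that rule. The function name and return strings are hypothetical, and the (pI, pH) pairs are taken from the table above. It is not the selection procedure of the Pharmacia manual, and an individual row of the table need not follow it.

```python
def choose_exchanger(protein_pi: float, buffer_ph: float) -> str:
    """Suggest a resin type from the sign of the protein's net charge.

    Hypothetical helper: a rule of thumb only, not the manufacturer's procedure.
    """
    if buffer_ph > protein_pi:
        # Buffer pH above the pI: net negative protein, binds an anion exchanger.
        return "anion exchanger (Q type)"
    if buffer_ph < protein_pi:
        # Buffer pH below the pI: net positive protein, binds a cation exchanger.
        return "cation exchanger (S type)"
    return "no net charge at this pH"


# (protein, pI, buffer pH) values copied from the purification table
for name, pi, ph in [("406.1L", 7.75, 9.0), ("953L", 7.56, 6.6), ("919-287", 6.58, 8.0)]:
    print(name, "->", choose_exchanger(pi, ph))
```

For these three rows the suggestion agrees with the column listed in the table (Mono Q, Mono S and Hi-trap Q respectively).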

The letter in the 'protocol' column refers to the following:

FPLC-A: Clones 121.1, 128.1, 593, 726, 982, periplasmic protein 920L and hybrid proteins
919-287, 953-287 were purified from the soluble fraction of E.coli obtained after disruption
of the cells. Single colonies harbouring the plasmid of interest were grown overnight at 37°C
in 20 ml of LB/Amp (100 µg/ml) liquid culture. Bacteria were diluted 1:30 in 1.0 L of fresh
medium and grown at either 30°C or 37°C until the OD550 reached 0.6-0.8. Expression of
recombinant protein was induced with IPTG at a final concentration of 1.0 mM. After
incubation for 3 hours, bacteria were harvested by centrifugation at 8000g for 15 minutes at
4°C. When necessary cells were stored at -20°C. All subsequent procedures were performed
on ice or at 4°C. For cytosolic proteins (121.1, 128.1, 593, 726 and 982) and periplasmic
protein 920L, bacteria were resuspended in 25 ml of PBS containing complete protease
inhibitor (Boehringer-Mannheim). Cells were lysed by sonication using a Branson
Sonifier 450. Disrupted cells were centrifuged at 8000g for 30 min to sediment unbroken
cells and inclusion bodies and the supernatant taken to 35% v/v saturation by the addition of
3.9 M (NH4)2SO4. The precipitate was sedimented at 8000g for 30 minutes. The supernatant
was taken to 70% v/v saturation by the addition of 3.9 M (NH4)2SO4 and the precipitate
collected as above. Pellets containing the protein of interest were identified by SDS-PAGE
and dialysed against the appropriate ion-exchange buffer (see below) for 6 hours or
overnight. The periplasmic fraction from E.coli expressing 953L was prepared according to
the protocol of Evans et al. [Infect.Immun. (1974) 10:1010-1017] and dialysed against the
appropriate ion-exchange buffer. Buffer and ion exchange resin were chosen according to
the pI of the protein of interest and the recommendations of the FPLC protocol manual
(Pharmacia). Buffer solutions included 20 mM NaCl and 10% (v/v) glycerol. The dialysate
was centrifuged at 13000g for 20 min and applied to either a mono Q or mono S FPLC
ion-exchange resin. Proteins were eluted from the ion-exchange resin using either step-wise
or continuous NaCl gradients. Purification was analysed by SDS-PAGE and protein
concentration determined by the Bradford method. Cleavage of the leader peptide of
periplasmic proteins was demonstrated
by sequencing the NH2-terminus (see below).

FPLC-B: These proteins were purified from the membrane fraction of E.coli. Single
colonies harbouring the plasmid of interest were grown overnight at 37°C in 20 ml of
LB/Amp (100 µg/ml) liquid culture. Bacteria were diluted 1:30 in 1.0 L of fresh medium.
Clones 406.1L and 919LOrf4 were grown at 30°C and Orf25L and 576.1L at 37°C until the
OD550 reached 0.6-0.8. In the case of 919LOrf4, growth at 30°C was essential since
expression of recombinant protein at 37°C resulted in lysis of the cells. Expression of
recombinant protein was induced with IPTG at a final concentration of 1.0 mM. After
incubation for 3 hours, bacteria were harvested by centrifugation at 8000g for 15 minutes at
4°C. When necessary cells were stored at -20°C. All subsequent procedures were performed
at 4°C. Bacteria were resuspended in 25 ml of PBS containing complete protease inhibitor
(Boehringer-Mannheim) and lysed by osmotic shock with 2-3 passages through a French
Press. Unbroken cells were removed by centrifugation at 5000g for 15 min and membranes
precipitated by centrifugation at 100000g (Beckman Ti50, 38000rpm) for 45 minutes. A
Dounce homogenizer was used to re-suspend the membrane pellet in 7.5 ml of 20 mM
Tris-HCl (pH 8.0), 1.0 M NaCl and complete protease inhibitor. The suspension was mixed
for 2-4 hours, centrifuged at 100000g for 45 min and the pellet resuspended in 7.5 ml of
20 mM Tris-HCl (pH 8.0), 1.0 M NaCl, 5.0 mg/ml CHAPS, 10% (v/v) glycerol and complete
protease inhibitor. The solution was mixed overnight, centrifuged at 100000g for 45 minutes
and the supernatant dialysed for 6 hours against an appropriately selected buffer. In the case
of Orf25L, the pellet obtained after CHAPS extraction was found to contain the recombinant
protein. This fraction, without further purification, was used to immunise mice.

FPLC-C: Identical to FPLC-A, but purification was from the soluble fraction obtained after 
permeabilising E.coli with polymyxin B, rather than after cell disruption. 

FPLC-D: A single colony harbouring the plasmid of interest was grown overnight at 37°C
in 20 ml of LB/Amp (100 µg/ml) liquid culture. Bacteria were diluted 1:30 in 1.0 L of fresh
medium and grown at 30°C until the OD550 reached 0.6-0.8. Expression of recombinant
protein was induced with IPTG at a final concentration of 1.0 mM. After incubation for 3
hours, bacteria were harvested by centrifugation at 8000g for 15 minutes at 4°C. When
necessary cells were stored at -20°C. All subsequent procedures were performed on ice or at
4°C. Cells were resuspended in 20 mM Bicine (pH 8.5), 20 mM NaCl, 10% (v/v) glycerol,
complete protease inhibitor (Boehringer-Mannheim) and disrupted using a Branson Sonifier
450. The sonicate was centrifuged at 8000g for 30 min to sediment unbroken cells and
inclusion bodies. The recombinant protein was precipitated from solution between 35% v/v
and 70% v/v saturation by the addition of 3.9 M (NH4)2SO4. The precipitate was sedimented
at 8000g for 30 minutes, resuspended in 20 mM Bicine (pH 8.5), 20 mM NaCl, 10% (v/v)
glycerol and dialysed against this buffer for 6 hours or overnight. The dialysate was
centrifuged at 13000g for 20 min and applied to the FPLC resin. The protein was eluted from
the column using a step-wise NaCl gradient. Purification was analysed by SDS-PAGE and
protein concentration determined by the Bradford method.
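
The 35% to 70% ammonium sulphate cut used in the protocols above can be illustrated with a short calculation. The sketch below assumes, as a labelled simplification not stated in the text, that the 3.9 M (NH4)2SO4 stock is treated as the 100%-saturated solution and that volumes are additive. Under those assumptions, raising a volume V from fractional saturation s1 to s2 needs V(s2 - s1)/(1 - s2) of stock. The function name is hypothetical.

```python
def saturated_stock_volume(volume_ml: float, start_fraction: float, target_fraction: float) -> float:
    """Volume of saturated stock (ml) that raises `volume_ml` from one
    fractional saturation to another.

    Assumptions (not from the source): the stock counts as 100% saturated
    and volumes are additive.
    """
    if not 0.0 <= start_fraction < target_fraction < 1.0:
        raise ValueError("need 0 <= start < target < 1")
    return volume_ml * (target_fraction - start_fraction) / (1.0 - target_fraction)


lysate_ml = 25.0  # e.g. cells resuspended in 25 ml of PBS
first_cut = saturated_stock_volume(lysate_ml, 0.0, 0.35)
# After the 35% pellet is removed, the supernatant is taken on to 70%.
supernatant_ml = lysate_ml + first_cut
second_cut = saturated_stock_volume(supernatant_ml, 0.35, 0.70)
print(f"add {first_cut:.1f} ml to reach 35%, then {second_cut:.1f} ml to reach 70%")
```

The second step needs considerably more stock than the first because the target approaches the concentration of the stock itself.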

Cloning strategy and oligonucleotide design 

Genes coding for antigens of interest were amplified by PCR, using oligonucleotides
designed on the basis of the genomic sequence of N. meningitidis B MC58. Genomic DNA
from strain 2996 was always used as a template in PCR reactions, unless otherwise specified,
and the amplified fragments were cloned in the expression vector pET21b+ (Novagen) to
express the protein as C-terminal His-tagged product, or in pET-24b+ (Novagen) to express
the protein in 'untagged' form (e.g. ΔG 287K).

Where a protein was expressed without a fusion partner and with its own leader peptide (if
present), amplification of the open reading frame (ATG to STOP codons) was performed.

Where a protein was expressed in 'untagged' form, the leader peptide was omitted by
designing the 5'-end amplification primer downstream from the predicted leader sequence.
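
The amplification primers listed in the oligonucleotide table further below are written as a 5' tail and a hybridising region separated by a hyphen, with the restriction site used for cloning carried in the tail. The following sketch, assuming that hyphen convention, reports which recognition sequences occur in a primer's tail. The function and dictionary names are hypothetical. The recognition sequences are the standard ones for the enzymes named in the table.

```python
# Standard recognition sequences of the enzymes named in the primer table.
RECOGNITION_SITES = {
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "AseI": "ATTAAT",
}


def split_primer(primer):
    """Split 'TAIL-HYBRIDISING' at the first hyphen (assumed table convention)."""
    tail, sep, hybridising = primer.partition("-")
    if not sep:
        # No hyphen: the tail cannot be separated from the hybridising region.
        return "", primer
    return tail, hybridising


def sites_in_tail(primer):
    """Return the names of the restriction sites found in the primer tail."""
    tail, hybridising = split_primer(primer)
    region = (tail or hybridising).upper()
    return sorted(name for name, site in RECOGNITION_SITES.items() if site in region)


# Orf4L primers as listed in the table
print(sites_in_tail("CGCGGATCCCATATG-AAAACCTTCTTCAAAACC"))
print(sites_in_tail("CCCGCTCGAG-TTATTTGGCTGCGCCTTC"))
```

A tail may carry more than one site. The Orf4L forward tail, for example, contains both a BamHI and an NdeI sequence, while the table lists only the site used for that construct.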

The melting temperature of the primers used in PCR depended on the number and type of
hybridising nucleotides in the whole primer, and was determined using the formulae:

Tm1 = 4(G+C) + 2(A+T)                  (tail excluded)

Tm2 = 64.9 + 0.41 (% GC) - 600/N       (whole primer)

The melting temperatures of the selected oligonucleotides were usually 65-70°C for the
whole oligo and 50-60°C for the hybridising region alone.

Oligonucleotides were synthesised using a Perkin Elmer 394 DNA/RNA Synthesizer, eluted
from the columns in 2.0ml NH4OH, and deprotected by 5 hours incubation at 56°C. The
oligos were precipitated by addition of 0.3M Na-Acetate and 2 volumes ethanol. The
samples were centrifuged and the pellets resuspended in water.
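
The two melting-temperature formulae given above can be sketched in a few lines. This is an illustrative calculation only: the function names are hypothetical, N is taken to be the primer length in nucleotides, and the example primer is the Orf4L forward primer from the table below, with the hyphen assumed to separate the tail from the hybridising region.

```python
def tm_hybridising(region: str) -> int:
    """Tm1 = 4(G+C) + 2(A+T), applied to the hybridising region (tail excluded)."""
    s = region.upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    return 4 * gc + 2 * at


def tm_whole(primer: str) -> float:
    """Tm2 = 64.9 + 0.41(%GC) - 600/N, applied to the whole primer."""
    s = primer.upper().replace("-", "")  # drop the tail/region separator
    n = len(s)
    gc_percent = 100.0 * (s.count("G") + s.count("C")) / n
    return 64.9 + 0.41 * gc_percent - 600.0 / n


primer = "CGCGGATCCCATATG-AAAACCTTCTTCAAAACC"  # Orf4L Fwd
tail, hybridising = primer.split("-")
print("hybridising region:", tm_hybridising(hybridising), "degrees C")
print("whole primer:", round(tm_whole(primer), 1), "degrees C")
```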
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Sequences 


Name                      Dir  Restriction site  Sequence
Orf1L                     Fwd  NheI              CCjCCjCj A 1 tCUL I AGC-AAAACAACCCjACAAACGCj
                          Rev  XhoI              CCCCjC 1 CCjACj- 1 1 ACCACjCOCj 1 AuLL 1 A
Orf1                      Fwd  NheI              CTAGCTAGC-GGACACACTTATTTCGGCATC
                          Rev  XhoI              CCCGCTCGAG-TTACCAGCGGTAGCCTAATTTG
Orf1LOmpA                 Fwd  NdeI-(NheI)
                          Rev  XhoI              CCCGCTCGAG-
Orf4L                     Fwd  NdeI              CGCGGATCCCATATG-AAAACCTTCTTCAAAACC
                          Rev  XhoI              CCCGCTCGAG-TTATTTGGCTGCGCCTTC
OrfML                     Fwd  AseI              GCGGCATTAAT-ATGTTGAGAAAATTGTTGAAATGG
                          Rev  XhoI              GCGGCCTCGAG-TTATTTTTTCAAAATATATTTGC
Orf9-1L                   Fwd  NdeI              GCGGCCATATG-TTACCTAACCGTTTCAAAATGT
                          Rev  XhoI              GCGGCCTCGAG-TTATTTCCGAGGTTTTCGGG
Orf23L                    Fwd  NdeI              CGCGGATCCCATATG-ACACGCTTCAAATATTC
                          Rev  XhoI              CCCGCTCGAG-TTATTTAAACCGATAGGTAAA
Orf25-1 His               Fwd  NdeI              CGCGGATCCCATATG-GGCAGGGAAGAACCGC
                          Rev  HindIII           GCCCAAGCTT-ATCGATGGAATAGCCGCG
Orf29-1 b-His (MC58)      Fwd  NheI              CGCGGATCCGCTAGC-AACGGTTTGGATGCCCG
                          Rev  XhoI              CCCGCTCGAG-TTTGTCTAAGTTCCTGATAT
                                                 CCCGCTCGAG-ATTCCCACCTGCCATC
Orf29-1 b-L (MC58)        Fwd  NheI              CGCGGATCCGCTAGC-ATGAATTTGCCTATTCAAAAAT
                          Rev  XhoI              CCCGCTCGAG-TTAATTCCCACCTGCCATC
Orf29-1 c-His (MC58)      Fwd  NheI              CGCGGATCCGCTAGC-ATGAATTTGCCTATTCAAAAAT
                          Rev  XhoI              CCCGCTCGAG-TTGGACGATGCCCGCGA
Orf29-1 c-L (MC58)        Fwd  NheI              CGCGGATCCGCTAGC-ATGAATTTGCCTATTCAAAAAT
                          Rev  XhoI              CCCGCTCGAG-TTATTGGACGATGCCCGC
Orf25L                    Fwd  NdeI              CGCGGATCCCATATG-TATCGCAAACTGATTGC
                          Rev  XhoI              CCCGCTCGAG-CTAATCGATGGAATAGCC
Orf37L                    Fwd  NdeI              CGCGGATCCCATATG-AAACAGACAGTCAAATG
                          Rev  XhoI              CCCGCTCGAG-TCAATAACCCGCCTTCAG
Orf38L                    Fwd  NdeI              CGCGGATCCCATATG-TTACGTTTGACTGCTTTAGCCGTATGCACC
                          Rev  XhoI              CCCGCTCGAG-TTATTTTGCCGCGTTAAAAGCGTCGGCAAC
Orf40L                    Fwd  NdeI              CGCGGATCCCATATG-AACAAAATATACCGCAT
                          Rev  XhoI              CCCGCTCGAG-TTACCACTGATAACCGAC
Orf40-2-His               Fwd  NdeI              CGCGGATCCCATATG-ACCGATGACGACGATTTAT
                          Rev  HindIII           GCCCAAGCTT-CCACTGATAACCGACAGA
                          Fwd
                          Rev  HindIII           GCCCAAGCTT-TTACCACTGATAACCGAC
Orf46-2L                  Fwd  NdeI              GGGAATTCCATATG-GGCATTTCCCGCAAAATATC
                          Rev  XhoI              CCCGCTCGAG-TTATTTACTCCTATAACGAGGTCTCTTAAC
Orf46-2                   Fwd  NdeI              GGGAATTCCATATG-TCAGATTTGGCAAACGATTCTT
                          Rev  XhoI              CCCGCTCGAG-TTATTTACTCCTATAACGAGGTCTCTTAAC
Orf46.1L                  Fwd  NdeI              GGGAATTCCATATG-GGCATTTCCCGCAAAATATC
                          Rev  XhoI              CCCGCTCGAG-TTACGTATCATATTTCACGTGC


orf46. (His-GST)          Fwd  BamHI-NdeI        GGGAATTCCATATGCACGTGAAATATGATACGAAG
                          Rev  XhoI              CCCGCTCGAGTTTACTCCTATAACGAGGTCTCTTAAC
orf46.1-His               Fwd  NdeI              p T nn a a ttpp ata TnTf aha TTTr T nr aaapha ttptt
                          Rev                    PPPfrPTPP A frPfrT A TP A T ATTTP A PPTHP
orf46.2-His               Fwd  NdeI              GGGAATTCCATATGTCAGATTTGGCAAACGATTCTT
                          Rev  XhoI              CCCGCTCGAGTTTACTCCTATAACGAGGTCTCTTAAC
Orf65-1-(His/GST) (MC58)  Fwd  BamHI-NdeI        CGCGGATCCCATATG-CAAAATGCGTTCAAAATCCC
                          Rev  XhoI              CGCGGATCCCATATG-AACAAAATATACCGCAT
                                                 CCCGCTCGAG-TTTGCTTTCGATAGAACGG
Orf72-1L                  Fwd  NdeI              GCGGCCATATG-GTCATAAAATATACAAATTTGAA
                          Rev  XhoI              GCGGCCTCGAG-TTAGCCTGAGACCTTTGCAAATT
Orf76-1L                  Fwd  NdeI              GCGGCCATATG-AAACAGAAAAAAACCGCTG
                          Rev  XhoI              GCGGCCTCGAG-TTACGGTTTGACACCGTTTTC
Orf83.1L                  Fwd  NdeI              CGCGGATCCCATATG-AAAACCCTGCTCCTC
                          Rev  XhoI              CCCGCTCGAG-TTATCCTCCTTTGCGGC
Orf85-2L                  Fwd  NdeI              GCGGCCATATG-GCAAAAATGATGAAATGGG
                          Rev  XhoI              GCGGCCTCGAG-TTATCGGCGCGGCGGGCC
Orf91L (MC58)             Fwd  NdeI              GCGGCCATATGAAAAAATCCTCCCTCATCA
                          Rev  XhoI              GCGGCCTCGAGTTATTTGCCGCCGTTTTTGGC
Orf91-His(MC58)           Fwd  NdeI              GCGGCCATATGGCCCCTGCCGACGCGGTAAG
                          Rev  XhoI              GCGGCCTCGAGTTTGCCGCCGTTTTTGGCTTTC
Orf97-1L                  Fwd  NdeI              GCGGCCATATG-AAACACATACTCCCCCTGA
                          Rev  XhoI              GCGGCCTCGAG-TTATTCGCCTACGGTTTTTTG
Orf119L (MC58)            Fwd  NdeI              GCGGCCATATGATTTACATCGTACTGTTTC
                          Rev  XhoI              GCGGCCTCGAGTTAGGAGAACAGGCGCAATGC
Orf119-His(MC58)          Fwd  NdeI              GCGGCCATATGTACAACATGTATCAGGAAAAC
                          Rev  XhoI              GCGGCCTCGAGGGAGAACAGGCGCAATGCGG
Orf137.1 (His-GST) (MC58) Fwd  BamHI-NheI        CGCGGATCCGCTAGCTGCGGCACGGCGGG
                          Rev  XhoI              CCCGCTCGAGATAACGGTATGCCGCCAG
Orf143-1L                 Fwd  NdeI              CGCGGATCCCATATG-GAATCAACACTTTCAC
                          Rev  XhoI              CCCGCTCGAG-TTACACGCGGTTGCTGT
008                       Fwd  NdeI              CGCGGATCCCATATG-AACAACAGACATTTTG
                          Rev  XhoI              CCCGCTCGAG-TTACCTGTCCGGTAAAAG
050-1(48)                 Fwd  NheI              CGCGGATCCGCTAGC-ACCGTCATCAAACAGGAA
                          Rev  XhoI              CCCGCTCGAG-TCAAGATTCGACGGGGA
105                       Fwd  NdeI              CGCGGATCCCATATG-TCCGCAAACGAATACG
                          Rev  XhoI              LLLut 1 CCjACj- 1 CA(jr 1 U 1 1 C 1 CjCCACj 1 1 T
111L                      Fwd  NdeI              CGCGGATCCCATATG-CCGTCTGAAACACG
                          Rev  XhoI              CCCGCTCGAG-TTAGCGGAGCAGTTTTTC
117-1                     Fwd  NdeI              CGCGGATCCCATATG-ACCGCCATCAGCC
                          Rev  XhoI              CCCGCTCGAG-TTAAAGCCGGGTAACGC
121-1                     Fwd  NdeI              GCGGCCATATG-GAAACACAGCTTTACATCGG
                          Rev  XhoI              GCGGCCTCGAG-TCAATAATAATATCCCGCG



WO 01/64922 



PCT/IB01/00452 






122-1                     Fwd  NdeI              GCGGCCATATG-ATTAAAATCCGCAATATCC
                          Rev  XhoI              p^ppppptph a n tta a a TPTTfirTT aha ttch a ttthh
128-1                     Fwd  NdeI              GCGGCCATATG-ACTGACAACGCACTGCTCC
                          Rev  XhoI              nr*nnr*r v Tr'n An tp aha pppppttptpp a a at
148                       Fwd  NdeI              CGCGGATCCCATATG-GCGTTAAAAACATCAAA
                          Rev  XhoI              CCCGCTCGAG-TCAGCCCTTCATACAGC
149.1L (MC58)             Fwd  AseI              GCGGCATTAATGGCACAAACTACACTCAAACC
                          Rev  XhoI              GCGGCCTCGAGTTAAAACTTCACGTTCACGCCG
149.1-His(MC58)           Fwd  AseI              GCGGCATTAATGCATGAAACTGAGCAATCGGTGG
                          Rev  XhoI              GCGGCCTCGAGAAACTTCACGTTCACGCCGCCGGTAAA
205 (His-GST) (MC58)      Fwd  BamHI-NdeI        CGCGGATCCCATATGGGCAAATCCGAAAATACG
                          Rev  XhoI              CCCGCTCGAGATAATGGCGGCGGCGG
206L                      Fwd  NdeI              CGCGGATCCCATATG-TTTCCCCCCGACAA
                          Rev  XhoI              CCCGCTCGAG-TCATTCTGTAAAAAAAGTATG
214 (His-GST) (MC58)      Fwd  BamHI-NdeI        CGCGGATCCCATATGCTTCAAAGCGACAGCAG
                          Rev  XhoI              nccncTcn a pttppp a TTTTrnrrfT a ptp
216                       Fwd  NdeI              r^cmnc* a tppp ata tp pp a a tppp aha a a a pp
                          Rev                    rv'pnrTppAr^ ptatapa t±TCCCYYC\ccc±
225-1L                    Fwd  NdeI              PPPflP A TPPP ATA TPt P A TTPTTTTTTP AAA PP
                          Rev  XhoI              rrpfrPTrPi a n tp a pttp a p a a a PtPPtPp
235L                      Fwd  NdeI              CnCTZn A TPPP ATA TP AAA PPTTTP A TTTT A PP
                          Rev  XhoI              PPPPPTPP A P TT A TTTPPPPTPPTPTTP
243                       Fwd  NdeI              CGCGGATCCCATATG-GTAATCGTCTGGTTG
                          Rev  XhoI              CCCGCTCGAG-CTACGACTTGGTTACCG
247-1L                    Fwd  NdeI              GCGGCCATATG-AGACGTAAAATGCTAAAGCTAC
                          Rev  XhoI              GCGGCCTCGAG-TCAAAGTGTTCTGTTTGCGC
264-His                   Fwd  NdeI              GCCGCCATATG-TTGACTTTAACCCGAAAAA
                          Rev  XhoI              GCCGCGTCGAG-GCCGGCGGTCAATACCGCCCGAA
270 (His-GST) (MC58)      Fwd  BamHI-NdeI        CGCGGATCCCATATGGCGCAATGCGATTTGAC
                          Rev  XhoI              CCCGCTCGAGTTCGGCGGTAAATGCCG
274L                      Fwd  NdeI              GCGGCCATATG-GCGGGGCCGATTTTTGT
                          Rev  XhoI              GCGGCCTCGAG-TTATTTGCTTTCAGTATTATTG
283L                      Fwd  NdeI              GCGGCCATATG-AACTTTGCTTTATCCGTCA
                          Rev  XhoI              GCGGCCTCGAG-TTAACGGCAGTATTTGTTTAC
285-His                   Fwd  BamHI             CGCGGATCCCATATGGGTTTGCGCTTCGGGC
                          Rev  HindIII           GCCCAAGCTTTTTTCCTTTGCCGTTTCCG
286-His (MC58)            Fwd  NdeI              CGCGGATCCCATATG-GCCGACCTTTCCGAAAA
                          Rev  XhoI              CCCGCTCGAG-GAAGCGCGTTCCCAAGC
286L (MC58)               Fwd  NdeI              CGCGGATCCCATATG-CACGACACCCGTAC
                          Rev  XhoI              CCCGCTCGAG-TTAGAAGCGCGTTCCCAA
287L                      Fwd  NheI              CTAGCTAGC-TTTAAACGCAGCGTAATCGCAATGG
                          Rev  XhoI              CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC
287                       Fwd  NheI              CTAGCTAGC-GGGGGCGGCGGTGGCG
                          Rev  XhoI              CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC


287LOrf4                  Fwd  NheI              CTAGCTAGCGCTCATCCTCGCCGCC-TGCGGGGGCGGCGGT
                          Rev  XhoI              LLLuL 1 LuAu- 1 LAA 1 I OOiOl I 1 1 1 1 OOO
2o7-lu                    Fwd  BamHI             tuuuuA X 00-00000000000 1 OOOO
                          Rev  XhoI
287-His                   Fwd  NheI              O 1 AGO 1 AOO-OOGGGGGGGGG 1 OOOO
                          Rev  XhoI              OOOOO 1 OO AO- A 1 OO 1 OO ILllllll OOO *
287-His(2996)             Fwd  NheI              O 1 AOL 1 AOO- 1 OOOOOOOOOOOOO 1 OOOO
                          Rev  XhoI              CGCGO 1 OGAG- A 1 GO 1 GO lOlilill GOG
Δ1 287-His                Fwd  NheI              CGOGGA 1 OOOO 1 AGC-OOGGA 1 Cj 1 1 AAA 1 Luut
Δ2 287-His                Fwd  NheI              CGCGGA I OGGCTAGG-O AAGA i ATGGCGGOAG T
Δ3 287-His                Fwd  NheI              GGOGGA 1 LLUL 1 Aut-LrttUAA 1 LLuLAAA 1 LA
Δ4 287-His                Fwd  NheI              OOOOO 1 AGG-GGAAGGG 1 1 OA ill GOO 1 AA 1 GO
Δ4 287MC58-His            Fwd  NheI              CGOGC 1 AOL-CjCjAAOOu 1 1 OA 1 TTGGG 1 AA 1 GO
287a-His                  Fwd  NdeI              GGCC A 1 A 1 Cj- 1 1 1 AAALCjLACjLCj 1 A A 1 LCjL
                          Rev  XhoI              OOOOO 1 LuAb-AAAA 1 1 CjL 1 AGOGGO A 1 1 LuLAuu
287b-His                  Fwd  NdeI              CGCCATATG-GGAAGGGTTGATTTGGCTAATGG
287b-2996-His             Rev  XhoI              CCCGC I CGAG-CT IGlCIllAi AAATGA1 GACA 1 ATI 1 G
287b-MC58-His             Rev  XhoI              OOCOO 1 LUAu- 1 1 1 A 1 A AA AOA 1 A A 1 A 1 A I 1 OA 1 1 OA 1 1 OO
287c-2996-His             Fwd  NheI              CGCGCTAGC-ATGCCGCTGATTCCCGTCAATC §
'287 untagged'(2996)      Fwd  NheI              CTAGCTAGC-GGGGGCGGCGGTGGCG
                          Rev  XhoI              CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC
ΔG287-His *               Fwd  NheI              CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC
                          Rev  XhoI              CCCGCTCGAG-ATCCTGCTCTTTTTTGCC
ΔG287K(2996)              Fwd  NheI              CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC
                          Rev  XhoI              CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC
ΔG287-L                   Fwd  NheI              CGCGGATCCGCTAGC-
                                                 TTTCtA ACGC AGTOTH ATTGC A ATGHCTTOT ATTTTTOCC
                                                 CTTTCAGCCTGT TCGCCCGATGTTAAATCGGCG
                          Rev  XhoI              CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC
ΔG287-Orf4L               Fwd  NheI              CGCGGATCCGCTAGC-
                                                 AAAACCTTCTTCAAAACCCTTTCCGCCGCCGCACTCGCG
                                                 CTCATCCTCGCCGCCTGC TCGCCCGATGTTAAATCG
                          Rev  XhoI
292L                      Fwd  NdeI              OOOOO A 1 OOOA 1 A 1 0- AAAAOO A AO 1 1 AA 1 OA AA
                          Rev  XhoI              OOOOO 1 OO AO- 1 1 A 1 1 OA 11111 OOOOA 1 OA
308-1                     Fwd  NdeI              OOOOOA 1 OOOA 1 A 1 0- 1 1 AAA 1 OOOO 1A1 1 1 lAIO
                          Rev  XhoI              CCCGCTCGAG-TTAATCCGCCATTCCCTG
401L                      Fwd  NdeI              GCGGCCATATG-AAATTACAACAATTGGCTG
                          Rev  XhoI              GCGGCCTCGAG-TTACCTTACGTTTTTCAAAG
406L                      Fwd  NdeI              CGCGGATCCCATATG-CAAGCACGGCTGCT
                          Rev  XhoI              CCCGCTCGAG-TCAAGGTTGTCCTTGTCTA
502-1L                    Fwd  NdeI              CGCGGATCCCATATG-ATGAAACCGCACAAC
                          Rev  XhoI              CCCGCTCGAG-TCAGTTGCTCAACACGTC
502-A (His-GST)           Fwd  BamHI-NdeI        CGCGGATCCCATATGGTAGACGCGCTTAAOCA
                          Rev  XhoI              CCCGCTCGAGAGCTGCATGGCGGCG


503-1L                    Fwd  NdeI              CGCGGATCCCATATG-GCACGGTCGTTATAC
                          Rev  XhoI              CCCGCTCGAG-CTACCGCGCATTCCTG
519-1L                    Fwd  NdeI              GCGGCCATATG-GAATTTTTCATTATC1 TGI I
                          Rev  XhoI              GCGGCCTCGAG-TTATTTGGCGGTTTTGCTGC
525-1L                    Fwd  NdeI              GCGGCCATATG-AAGTATGTCCGGTTA1 1 11 1C
                          Rev  XhoI              LjCLjLjCL^ 1 L-LjALj- 1 1 A 1 LAjLjC 1 1 Lj 1 LjL.AAL.LjLj
529-(His/GST) (MC58)      Fwd  BamHI-NheI        CGCGG A 1 CCLjC I ACjL- 1 CCGGCAGC AAAACCLjA
                          Rev  HindIII           LjCCC AAOL 1 1 -ALLiLALi 1 1 CuOAA 1 LjLjALj
552L                      Fwd  NdeI              GCCGCCATATGTTGAATATTAAACTGAAAACCTTG
                          Rev  XhoI              GCCGCCTCGAGTTATTTCTGATGCCTTTTCCC
556L                      Fwd  NdeI              GCCGCCATATGGACAATAAGACCAAACTG
                          Rev  XhoI              GCCGCCTCGAGTTAACGGTGCGGACGTTTC
557L                      Fwd  NdeI              CGCGGATCCCATATG-AACAAACTGTTTCTTAC
                          Rev  XhoI              CCCGCTCGAG-TCATTCCGCCTTCAGAAA
564ab-(His/GST) (MC58)    Fwd  BamHI-NdeI        CGCGGATCCCATATG-CAAGGTATCGTTGCCGACAAATCCGCACCT
                          Rev  XhoI              CCCGCTCGAG-AGCTAATTGTGCTTGGTTTGCAGATAGGAGTT
564abL (MC58)             Fwd  NdeI              CGCGGATCCCATATG-AACCGCACCCTGTACAAAGTTGTATTTAACAAACATC
                          Rev  XhoI              CCCGCTCGAG-TTAAGCTAATTGTGCTTGGTTTGCAGATAGGAGTT
564b-(His/GST)(MC58)      Fwd  BamHI-NdeI        CGCGGATCCCATATG-ACGGGAGAAAATCATGCGGTTTCACTTCATG
                          Rev  XhoI              CCCGCTCGAG-AGCTAATTGTGCTTGGTTTGCAGATAGGAGTT
564c-(His/GST)(MC58)      Fwd  BamHI-NdeI        CGCGGATCCCATATG-GTTTCAGACGGCCTATACAACCAACATGGTGAAATT
                          Rev  XhoI              CCCGCTCGAG-GCGGTAACTGCCGCTTGCACTGAATCCGTAA
564bc-(His/GST)(MC58)     Fwd  BamHI-NdeI        CGCGGATCCCATATG-ACGGGAGAAAATCATGCGGTTTCACTTCATG
                          Rev  XhoI              CCCGCTCGAG-GCGGTAACTGCCGCTTGCACTGAATCCGTAA
564d-(His/GST)(MC58)      Fwd  BamHI-NdeI        CGCGGATCCCATATG-CAAAGCAAAGTCAAAGCAGACCATGCCTCCGTAA
                          Rev  XhoI              CCCGCTCGAG-TCTTTTCCTTTCAATTATAACTTTAGTAGGTTCAATTTTGGTCCCC
564cd-(His/GST)(MC58)     Fwd  BamHI-NdeI        CGCGGATCCCATATG-GTTTCAGACGGCCTATACAACCAACATGGTGAAATT
                          Rev  XhoI              CCCGCTCGAG-TCTTTTCCTTTCAATTATAACTTTAGTAGGTTCAATTTTGGTCCCC
570L                      Fwd  NdeI              GCGGCCATATG-ACCCGTTTGACCCGCG
                          Rev  XhoI              GCGGCCTCGAG-TCAGCGGGCGTTCATTTCTT
576-1L                    Fwd  NdeI              CGCGGATCCCATATG-AACACCATTTTCAAAATC
                          Rev  XhoI              CCCGCTCGAG-TTAATTTACTTTTTTGATGTCG
580L                      Fwd  NdeI              GCGGCCATATG-GATTCGCCCAAGGTCGG
                          Rev  XhoI              GCGGCCTCGAG-CTACACTTCCCCCGAAGTGG


583L                      Fwd  NdeI              CGCGGATCCCATATG-ATAGTTGACCAAAGCC
                          Rev  XhoI              CCCGCTCGAG-TTATTTTTCCGATTTTTCGG
593                       Fwd  NdeI              GCGGCCATATG-CTTGAACTGAACGGACT
                          Rev  XhoI              GCGGCCTCGAG-TCAGCGGAAGCGGACGATT
650 (His-GST) (MC58)      Fwd  BamHI-NdeI        CGCGGATCCCATATGTCCAAACTCAAAACCATCG
                          Rev  XhoI              CCCGCTCGAGGCTTCCAATCAGTTTGACC
652                       Fwd  NdeI              GCGGCCATATG-AGCGCAATCGTTGATATTTTC
                          Rev  XhoI              GCGGCCTCGAG-TTATTTGCCCAGTTGGTAGAATG
664L                      Fwd  NdeI              GCGGCCATATG-GTGATACATCCGCACTACTTC
                          Rev  XhoI              GCGGCCTCGAG-TCAAAATCGAGTTTTACACCA
726                       Fwd  NdeI              GCGGCCATATG-ACCATCTATTTCAAAAACGG
                          Rev  XhoI              GCGGCCTCGAG-TCAGCCGATGTTTAGCGTCCATT
741-His(MC58)             Fwd  NdeI              CGCGGATCCCATATG-AGCAGCGGAGGGGGTG
                          Rev  XhoI              CCCGCTCGAG-TTGCTTGGCGGCAAGGC
ΔG741-His(MC58)           Fwd  NdeI              CGCGGATCCCATATG-GTCGCCGCCGACATCG
                          Rev  XhoI              CCCGCTCGAG-TTGCTTGGCGGCAAGGC
686-2-(His/GST) (MC58)    Fwd  BamHI-NdeI        CGCGGATCCCATATG-GGCGGTTCGGAAGGCG
                          Rev  XhoI              CCCGCTCGAG-TTGAACACTGATGTCTTTTCCGA
719-(His/GST) (MC58)      Fwd  BamHI-NheI        CGCGGATCCGCTAGC-AAACTGTCGTTGGTGTTAAC
                          Rev  XhoI              CCCGCTCGAG-TTGACCCGCTCCACGG
730-His (MC58)            Fwd  NdeI              GCCGCCATATGGCGGACTTGGCGCAAGACCC
                          Rev  XhoI              GCCGCCTCGAGATCTCCTAAACCTGTTTTAACAATGCCG
730A-His (MC58)           Fwd  NdeI              GCCGCCATATGGCGGACTTGGCGCAAGACCC
                          Rev  XhoI              GCGGCCTCGAGCTCCATGCTGTTGCCCCAGC
730B-His (MC58)           Fwd  NdeI              GCCGCCATATGGCGGACTTGGCGCAAGACCC
                          Rev  XhoI              GCGGCCTCGAGAAAATCCCCGCTAACCGCAG
741-His (MC58)            Fwd  NdeI              CGCGGATCCCATATG-AGCAGCGGAGGGGGTG
                          Rev  XhoI              CCCGCTCGAG-TTGCTTGGCGGCAAGGC
ΔG741-His (MC58)          Fwd  NdeI              CGCGGATCCCATATG-GTCGCCGCCGACATCG
                          Rev  XhoI              CCCGCTCGAG-TTGCTTGGCGGCAAGGC
743 (His-GST)             Fwd  BamHI-NdeI        CGCGGATCCCATATGGACGGTGTTGTGCCTGTT
                          Rev  XhoI              CCCGCTCGAGCTTACGGATCAAATTGACG
757 (His-GST) (MC58)      Fwd  BamHI-NdeI        CGCGGATCCCATATGGGCAGCCAATCTGAAGAA
                          Rev  XhoI              CCCGCTCGAGCTCAGCTTTTGCCGTCAA


/ J7"li 1?V VTkJ A 

(MC58) 


Fwd 


cocnn a Teener a c\c t a rTr a tpp a TTnTrrnr 


J3 cinini-rN uei 


Rev 


CCCGCTCGAG-CCAGTTGTAGCCTATTTTG 


Xhol 


759L 
(MC58) 


Fwd 


CGCGGATCCGCTAGC-ATGCGCTTCACACACAC 


Nhel 


Rev 


CCCGCTCGAG-TTACCAGTTGTAGCCTATTT 


Xhol 


760-His 


Fwd 
Rev 
Fwd 


GCCGCCATATGGCACAAACGGAAGGTTTGGAA 

GCCGCCTCGAGAAAACTGTAACGCAGGTTTGCCGTC 

GCGGCCATATGGAAGAAACACCGCGCGAACCG 


Ndel 
Xhol 
Ndel 




769-His (MC58) 
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Rev 


GCGGCCTCGAGGAACGTTTTATTAAACTCGAC 


Xhol 


907L | Fwd | GCGGCCATATG-AGAAAACCGACCGATACCCTA | NdeI
 | Rev | GCGGCCTCGAG-TCAACGCCACTGCCAGCGGTTG | XhoI
911L | Fwd | CGCGGATCCCATATG-AAGAAGAACATATTGGAATTTTGGGTCGGACTG | NdeI
 | Rev | CCCGCTCGAG-TTATTCGGCGGCTTTTTCCGCATTGCCG | XhoI
911LOmpA | Fwd | GGGAATTCCATATGAAAAAGACAGCTATCGCGATTGCAGTGGCACTGGCTGGTTTCGCTACCGTAGCGCAGGCCGCTAGC-GCTTTCCGCGTGGCCGGCGGTGC | NdeI-(NheI)
 | Rev | CCCGCTCGAG-TTATTCGGCGGCTTTTTCCGCATTGCCG | XhoI
911LPelB | Fwd | CATGCCATGG-CTTTCCGCGTGGCCGGCGGTGC | NcoI
 | Rev | CCCGCTCGAG-TTATTCGGCGGCTTTTTCCGCATTGCCG | XhoI
913-His/GST | Fwd | CGCGGATCCCATATG-TTTGCCGAAACCCGCC | BamHI-NdeI
 | Rev | CCCGCTCGAG-AGGTTGTGTTCCAGGTTG | XhoI
913L (MC58) | Fwd | CGCGGATCCCATATG-AAAAAAACCGCCTATG | NdeI
 | Rev | CCCGCTCGAG-TTAAGGTTGTGTTCCAGG | XhoI
919L | Fwd | CGCGGATCCCATATG-AAAAAATACCTATTCCGC | NdeI
 | Rev | CCCGCTCGAG-TTACGGGCGGTATTCGG | XhoI
919 | Fwd | CGCGGATCCCATATG-CAAAGCAAGAGCATCCAAA | NdeI
 | Rev | CCCGCTCGAG-TTACGGGCGGTATTCGG | XhoI
919L Orf4 | Fwd | GGGAATTCCATATGAAAACCTTCTTCAAAACCCTTTCCGCCGCCGCGCTAGCGCTCATCCTCGCCGCC-TGCCAAAGCAAGAGCATC | NdeI-(NheI)
 | Rev | CCCGCTCGAG-TTACGGGCGGTATTCGGGCTTCATACCG | XhoI
(919)-287fusion | Fwd | CGCGGATCCGTCGAC-TGTGGGGGCGGCGGTGGC | SalI
 | Rev | CCCGCTCGAG-TCAATCCTGCTCTTTTTTGCC | XhoI
920-1L | Fwd | GCGGCCATATG-AAGAAAACATTGACACTGC | NdeI
 | Rev | GCGGCCTCGAG-TTAATGGTGCGAATGACCGAT | XhoI
925-His/GST (MC58) GATE | Fwd | ggggacaagtttgtacaaaaaagcaggctTGCGGCAAGGATGCCGG | attB1
 | Rev | ggggaccactttgtacaagaaagctgggtCTAAAGCAACAATGCCGG | attB2
926L | Fwd | CGCGGATCCCATATG-AAACACACCGTATCC | NdeI
 | Rev | CCCGCTCGAG-TTATCTCGTGCGCGCC | XhoI
927-2-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-AGCCCCGCGCCGATT | BamHI-NdeI
 | Rev | CCCGCTCGAG-TTTTTGTGCGGTCAGGCG | XhoI
932-His/GST (MC58) GATE | Fwd | ggggacaagtttgtacaaaaaagcaggctTGTTCGTTTGGGGGATTTAAACCAAACCAAATC | attB1
 | Rev | ggggaccactttgtacaagaaagctgggtTCATTTTGTTTTTCCTTCTTCT[remainder illegible] | attB2
935 (His-GST) (MC58) | Fwd | CGCGGATCCCATATGGCGGATGCGCCCGCG | BamHI-NdeI
 | Rev | CCCGCTCGAGAAACCGCCAATCCGCC | XhoI
936-1L | Fwd | CGCGGATCCCATATG-AAACCCAAACCGCAC | NdeI
 | Rev | CCCGCTCGAG-TCAGCGTTGGACGTAGT | XhoI
953L | Fwd | GGGAATTCCATATG-AAAAAAATCATCTTCGCCG | NdeI
 | Rev | CCCGCTCGAG-TTATTGTTTGGCTGCCTCGAT | XhoI
953-fu | Fwd | GGGAATTCCATATG-GCCACCTACAAAGTGGACG | NdeI
 | Rev | CGGGGATCC-TTGTTTGGCTGCCTCGATTTG | BamHI
954 (His-GST) (MC58) | Fwd | CGCGGATCCCATATGCAAGAACAATCGCAGAAAG | BamHI-NdeI
 | Rev | CCCGCTCGAGTTTTTTCGGCAAATTGGCTT | XhoI


958-His/GST (MC58) GATE | Fwd | ggggacaagtttgtacaaaaaagcaggctGCCGATGCCGTTGCGG | attB1
 | Rev | ggggaccactttgtacaagaaagctgggtTCAGGGTCGTTTGTTGCG | attB2
961L | Fwd | CGCGGATCCCATATG-AAACACTTTCCATCC | NdeI
 | Rev | CCCGCTCGAG-TTACCACTCGTAATTGAC | XhoI
961 | Fwd | CGCGGATCCCATATG-GCCACAAGCGACGAC | NdeI
 | Rev | CCCGCTCGAG-TTACCACTCGTAATTGAC | XhoI
961c (His/GST) | Fwd | CGCGGATCCCATATG-GCCACAAACGACG | BamHI-NdeI
 | Rev | CCCGCTCGAG-ACCCACGTTGTAAGGTTG | XhoI
961c (His/GST) (MC58) | Fwd | CGCGGATCCCATATG-GCCACAAGCGACGACGA | BamHI-NdeI
 | Rev | CCCGCTCGAG-ACCCACGTTGTAAGGTTG | XhoI
961c-L | Fwd | CGCGGATCCCATATG-ATGAAACACTTTCCATCC | NdeI
 | Rev | CCCGCTCGAG-TTAACCCACGTTGTAAGGT | XhoI
961c-L (MC58) | Fwd | CGCGGATCCCATATG-ATGAAACACTTTCCATCC | NdeI
 | Rev | CCCGCTCGAG-TTAACCCACGTTGTAAGGT | XhoI
961d (His/GST) | Fwd | CGCGGATCCCATATG-GCCACAAACGACG | BamHI-NdeI
 | Rev | CCCGCTCGAG-GTCTGACACTGTTTTATCC | XhoI
961 Δ1-L | Fwd | CGCGGATCCCATATG-ATGAAACACTTTCCATCC | NdeI
 | Rev | CCCGCTCGAG-TTATGCTTTGGCGGCAAAG | XhoI
fu 961-... | Fwd | CGCGGATCCCATATG-GCCACAAACGACGAC | NdeI
 | Rev | CGCGGATCC-CCACTCGTAATTGACGCC | BamHI
fu 961-... (MC58) | Fwd | CGCGGATCCCATATG-GCCACAAGCGACGAC | NdeI
 | Rev | CGCGGATCC-CCACTCGTAATTGACGCC | BamHI
fu 961c-... | Fwd | CGCGGATCCCATATG-GCCACAAACGACGAC | NdeI
 | Rev | CGCGGATCC-ACCCACGTTGTAAGGTTG | BamHI
fu 961c-L-... | Fwd | CGCGGATCCCATATG-ATGAAACACTTTCCATCC | NdeI
 | Rev | CGCGGATCC-ACCCACGTTGTAAGGTTG | BamHI
fu (961)-741(MC58)-His | Fwd | CGCGGATCC-GGAGGGGGTGGTGTCG | BamHI
 | Rev | CCCGCTCGAG-TTGCTTGGCGGCAAGGC | XhoI
fu (961)-983-His | Fwd | CGCGGATCC-GGCGGAGGCGGCACTT | BamHI
 | Rev | CCCGCTCGAG-GAACCGGTAGCCTACG | XhoI
fu (961)-Orf46.1-His | Fwd | CGCGGATCCGGTGGTGGTGGT-TCAGATTTGGCAAACGATTC | BamHI
 | Rev | CCCGCTCGAG-CGTATCATATTTCACGTGC | XhoI
fu (961c-L)-741(MC58) | Fwd | CGCGGATCC-GGAGGGGGTGGTGTCG | BamHI
 | Rev | CCCGCTCGAG-TTATTGCTTGGCGGCAAG | XhoI
fu (961c-L)-983 | Fwd | CGCGGATCC-GGCGGAGGCGGCACTT | BamHI
 | Rev | CCCGCTCGAG-TCAGAACCGGTAGCCTAC | XhoI
fu (961c-L)-Orf46.1 | Fwd | CGCGGATCCGGTGGTGGTGGT-TCAGATTTGGCAAACGATTC | BamHI
 | Rev | CCCGCTCGAG-TTACGTATCATATTTCACGTGC | XhoI
961-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-GCCACAAGCGACGACG | BamHI-NdeI
 | Rev | CCCGCTCGAG-CCACTCGTAATTGACGCC | XhoI


961 Δ1-His | Fwd | CGCGGATCCCATATG-GCCACAAACGACGAC | NdeI
 | Rev | CCCGCTCGAG-TGCTTTGGCGGCAAAGTT | XhoI
961a-(His/GST) | Fwd | CGCGGATCCCATATG-GCCACAAACGACGAC | BamHI-NdeI
 | Rev | CCCGCTCGAG-TTTAGCAATATTATCTTTGTTCGTAGC | XhoI
961b-(His/GST) | Fwd | CGCGGATCCCATATG-AAAGCAAACCGTGCCGA | BamHI-NdeI
 | Rev | CCCGCTCGAG-CCACTCGTAATTGACGCC | XhoI
961-His/GST GATE | Fwd | ggggacaagtttgtacaaaaaagcaggctGCAGCCACAAACGACGACGATGTTAAAAAAGC | attB1
 | Rev | ggggaccactttgtacaagaaagctgggtTTACCACTCGTAATTGACGCCGACATGGTAGG | attB2
982 | Fwd | GCGGCCATATG-GCAGCAAAAGACGTACAGTT | NdeI
 | Rev | GCGGCCTCGAG-TTACATCATGCCGCCCATACCA | XhoI
983-His (2996) | Fwd | CGCGGATCCGCTAGC-TTAGGCGGCGGCGGAG | NheI
 | Rev | CCCGCTCGAG-GAACCGGTAGCCTACG | XhoI
ΔG983-His (2996) | Fwd | CCCCTAGCTAGC-ACTTCTGCGCCCGACTT | NheI
 | Rev | CCCGCTCGAG-GAACCGGTAGCCTACG | XhoI
983-His | Fwd | CGCGGATCCGCTAGC-TTAGGCGGCGGCGGAG | NheI
 | Rev | CCCGCTCGAG-GAACCGGTAGCCTACG | XhoI
ΔG983-His | Fwd | CGCGGATCCGCTAGC-ACTTCTGCGCCCGACTT | NheI
 | Rev | CCCGCTCGAG-GAACCGGTAGCCTACG | XhoI
983L | Fwd | CGCGGATCCGCTAGC-CGAACGACCCCAACCTTCCCTACAAAAACTTTCAA | NheI
 | Rev | CCCGCTCGAG-TCAGAACCGACGTGCCAAGCCGTTC | XhoI
987-His (MC58) | Fwd | GCCGCCATATGCCCCCACTGGAAGAACGGACG | NdeI
 | Rev | GCCGCCTCGAGTAATAAACCTTCTATGGGCAGCAG | XhoI
989-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-TCCGTCCACGCATCCG | BamHI-NdeI
 | Rev | CCCGCTCGAG-TTTGAATTTGTAGGTGTATTG | XhoI
989L (MC58) | Fwd | CGCGGATCCCATATG-ACCCCTTCCGCACT | NdeI
 | Rev | CCCGCTCGAG-TTATTTGAATTTGTAGGTGTAT | XhoI
CrgA-His (MC58) | Fwd | CGCGGATCCCATATG-AAAACCAATTCAGAAGAA | NdeI
 | Rev | CCCGCTCGAG-TCCACAGAGATTGTTTCC | XhoI
PilC1-ES (MC58) | Fwd | GATGCCCGAAGGGCGGG | 
 | Rev | GCCCAAGCTT-TCAGAAGAAGACTTCACGC | 
PilC1-His (MC58) | Fwd | CGCGGATCCCATATG-CAAACCCATAAATACGCTATT | NdeI
 | Rev | GCCCAAGCTT-GAAGAAGACTTCACGCCAG | HindIII
Δ1PilC1-His (MC58) | Fwd | CGCGGATCCCATATG-GTCTTTTTCGACAATACCGA | NdeI
 | Rev | GCCCAAGCTT-[illegible] | HindIII
PilC1L (MC58) | Fwd | CGCGGATCCCATATG-AATAAAACTTTAAAAAGGCGG | NdeI
 | Rev | [illegible] | [illegible]
ΔGTbp2-His (MC58) | Fwd | CGCGAATCCCATATG-TTCGATCTTGATTCTGTCGA | NdeI
 | Rev | CCCGCTCGAG-TCGCACAGGCTGTTGGCG | XhoI
Tbp2-His (MC58) | Fwd | CGCGAATCCCATATG-TTGGGCGGAGGCGGCAG | NdeI
 | Rev | CCCGCTCGAG-TCGCACAGGCTGTTGGCG | XhoI
Tbp2-His (MC58) | Fwd | CGCGAATCCCATATG-TTGGGCGGAGGCGGCAG | NdeI
 | Rev | CCCGCTCGAG-TCGCACAGGCTGTTGGCG | XhoI
NMB0109-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-GCAAATTTGGAGGTGCGC | BamHI-NdeI
 | Rev | CCCGCTCGAG-TTCGGAGCGGTTGAAGC | XhoI
NMB0109L (MC58) | Fwd | CGCGGATCCCATATG-CAACGTCGTATTATAACCC | NdeI
 | Rev | CCCGCTCGAG-TTATTCGGAGCGGTTGAAG | XhoI


NMB0207-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-GGCATCAAAGTCGCCATCAACGGCTAC | BamHI-NdeI
 | Rev | CCCGCTCGAG-TTTGAGCGGGCGCACTTCAAGTCCG | XhoI
NMB0462-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-GGCGGCAGCGAAAAAAAC | BamHI-NdeI
 | Rev | CCCGCTCGAG-GTTGGTGCCGACTTTGAT | XhoI
NMB0623-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-GGCGGCGGAAGCGATA | BamHI-NdeI
 | Rev | CCCGCTCGAG-TTTGCCCGCTTTGAGCC | XhoI
NMB0625 (His-GST) (MC58) | Fwd | CGCGGATCCCATATGGGCAAATCCGAAAATACG | BamHI-NdeI
 | Rev | CCCGCTCGAGCATCCCGTACTGTTTCG | XhoI
NMB0634 (His/GST) (MC58) | Fwd | ggggacaagtttgtacaaaaaagcaggctCCGACATTACCGTGTACAACGGCCAACAAAGAA | attB1
 | Rev | ggggaccactttgtacaagaaagctgggtCTTATTTCATACCGGCTTGCTCAAGCAGCCGG | attB2
NMB0776-His/GST (MC58) GATE | Fwd | ggggacaagtttgtacaaaaaagcaggctGATACGGTGTTTTCCTGTAAAACGGACAACAA | attB1
 | Rev | ggggaccactttgtacaagaaagctgggtCTAGGAAAAATCGTCATCGTTGAAATTCGCC | attB2
NMB1115-His/GST (MC58) GATE | Fwd | ggggacaagtttgtacaaaaaagcaggctATGCACCCCATCGAAACC | attB1
 | Rev | ggggaccactttgtacaagaaagctgggtCTAGTCTTGCAGTGCCTC | attB2
NMB1343-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-GGAAATTTCTTATATAGAGGCATTAG | BamHI-NdeI
 | Rev | CCCGCTCGAG-GTTAATTTCTATCAACTCTTTAGCAATAAT | XhoI
NMB1369 (His-GST) | Fwd | CGCGGATCCCATATGGCCTGCCAAGACGACA | BamHI-NdeI
 | Rev | CCCGCTCGAGCCGCCTCCTGCCGAAA | XhoI
NMB1551 (His-GST) | Fwd | CGCGGATCCCATATGGCAGAGATCTGTTTGATAA | BamHI-NdeI
 | Rev | CCCGCTCGAGCGGTTTTCCGCCCAATG | XhoI
NMB1899 (His-GST) (MC58) | Fwd | CGCGGATCCCATATGCAGCCGGATACGGTC | BamHI-NdeI
 | Rev | CCCGCTCGAGAATCACTTCCAACACAAAAT | XhoI
NMB2050-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-TGGTTGCTGATGAAGGGC | BamHI-NdeI
 | Rev | CCCGCTCGAG-GACTGCTTCATCTTCTGC | XhoI
NMB2050L (MC58) | Fwd | CGCGGATCCCATATG-GAACTGATGACTGTTTTGC | NdeI
 | Rev | CCCGCTCGAG-TCAGACTGCTTCATCTTCT | XhoI
NMB2159-(His/GST) (MC58) | Fwd | CGCGGATCCCATATG-AGCATTAAAGTAGCGATTAACGGTTTCGGC | BamHI-NdeI
 | Rev | CCCGCTCGAG-GATTTTGCCTGCGAAGTATTCCAAAGTGCG | XhoI
fu-ΔG287...-His | Fwd | CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC | NheI
 | Rev | CGGGGATCC-ATCCTGCTCTTTTTTGCCGG | BamHI


fu-(ΔG287)-919-His | Fwd | CGCGGATCCGGTGGTGGTGGT-CAAAGCAAGAGCATCCAAACC | BamHI
 | Rev | CCCAAGCTT-TTCGGGCGGTATTCGGGCTTC | HindIII
fu-(ΔG287)-953-His | Fwd | CGCGGATCCGGTGGTGGTGGT-GCCACCTACAAAGTGGAC | BamHI
 | Rev | GCCCAAGCTT-TTGTTTGGCTGCCTCGAT | HindIII
fu-(ΔG287)-961-His | Fwd | CGCGGATCCGGTGGTGGTGGT-ACAAGCGACGACG | BamHI
 | Rev | GCCCAAGCTT-CCACTCGTAATTGACGCC | HindIII
fu-(ΔG287)- | Fwd | CGCGGATCCGGTGGTGGTGGT-TCAGATTTGGCAAACGATTC | BamHI
 | Rev | CCCAAGCTT-CGTATCATATTTCACGTGC | HindIII
fu-(ΔG287-919)-Orf46.1-His | Fwd | CCCAAGCTTGGTGGTGGTGGTGGT-TCAGATTTGGCAAACGATTC | HindIII
 | Rev | CCCGCTCGAG-CGTATCATATTTCACGTGC | XhoI
fu-(ΔG287-Orf46.1)-919-His | Fwd | CCCAAGCTTGGTGGTGGTGGTGGT-CAAAGCAAGAGCATCCAAACC | HindIII
 | Rev | CCCGCTCGAG-CGGGCGGTATTCGGGCTT | XhoI
fu ΔG287(394.98)- | Fwd | CGCGGATCCGCTAGC-CCCGATGTTAAATCGGC | NheI
 | Rev | CGGGGATCC-ATCCTGCTCTTTTTTGCCGG | BamHI
fu Orf1-(Orf46.1)-His | Fwd | CGCGGATCCGCTAGC-GGACACACTTATTTCGGCATC | NheI
 | Rev | CGCGGATCC-CCAGCGGTAGCCTAATTTGAT | 
fu (Orf1)-Orf46.1-His | Fwd | CGCGGATCCGGTGGTGGTGGT-TCAGATTTGGCAAACGATTC | BamHI
 | Rev | CCCAAGCTT-CGTATCATATTTCACGTGC | HindIII
fu (919)-Orf46.1-His | Fwd1 | GCGGCGTCGACGGTGGCGGAGGCACTGGATCCTCAG | SalI
 | Fwd2 | GGAGGCACTGGATCCTCAGATTTGGCAAACGATTC | 
 | Rev | CCCGCTCGAG-CGTATCATATTTCACGTGC | XhoI
Fu orf46-.... | Fwd | GGAATTCCATATGTCAGATTTGGCAAACGATTC | NdeI
 | Rev | CGCGGATCCCGTATCATATTTCACGTGC | BamHI
Fu (orf46)-287-His | Fwd | CGGGGATCCGGGGGCGGCGGTGGCG | BamHI
 | Rev | CCCAAGCTTATCCTGCTCTTTTTTGCCGGC | HindIII
Fu (orf46)-919-His | Fwd | CGCGGATCCGGTGGTGGTGGTCAAAGCAAGAGCATCCAAACC | BamHI
 | Rev | CCCAAGCTTCGGGCGGTATTCGGGCTTC | HindIII
Fu (orf46-919)-287-His | Fwd | [sequence missing in source] | HindIII
 | Rev | CCCGCTCGAGATCCTGCTCTTTTTTGCCGGC | XhoI
Fu (orf46-287)-919-His | Fwd | CCCAAGCTTGGTGGTGGTGGTGGTCAAAGCAAGAGCATCCAAACC | HindIII
 | Rev | CCCGCTCGAGCGGGCGGTATTCGGGCTT | XhoI
(ΔG741)-961c-His | Fwd1 | GGAGGCACTGGATCCGCAGCCACAAACGACGACGA | 
 | Fwd2 | GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG | XhoI
 | Rev | CCCGCTCGAG-ACCCAGCTTGTAAGGTTG | XhoI
(ΔG741)-961-His | Fwd1 | GGAGGCACTGGATCCGCAGCCACAAACGACGACGA | 
 | Fwd2 | GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG | XhoI
 | Rev | CCCGCTCGAG-CCACTCGTAATTGACGCC | XhoI
(ΔG741)-983-His | Fwd | GCGGCCTCGAG-GGATCCGGCGGAGGCGGCACTTCTGCG | 
 | Rev | CCCGCTCGAG-GAACCGGTAGCCTACG | XhoI


(ΔG741)-orf46.1-His | Fwd1 | GGAGGCACTGGATCCTCAGATTTGGCAAACGATTC | 
 | Fwd2 | GCGGCGTCGACGGTGGCGGAGGCACTGGATCCTCAGA | SalI
 | Rev | CCCGCTCGAG-CGTATCATATTTCACGTGC | XhoI
(ΔG983)-741(MC58)-His | Fwd | GCGGCCTCGAG-GGATCCGGAGGGGGTGGTGTCGCC | XhoI
 | Rev | CCCGCTCGAG-TTGCTTGGCGGCAAG | XhoI
(ΔG983)-961c-His | Fwd1 | GGAGGCACTGGATCCGCAGCCACAAACGACGACGA | 
 | Fwd2 | GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG | XhoI
 | Rev | CCCGCTCGAG-ACCCAGCTTGTAAGGTTG | XhoI
(ΔG983)-961-His | Fwd1 | GGAGGCACTGGATCCGCAGCCACAAACGACGACGA | 
 | Fwd2 | GCGGCCTCGAG-GGTGGCGGAGGCACTGGATCCGCAG | XhoI
 | Rev | CCCGCTCGAG-CCACTCGTAATTGACGCC | XhoI
(ΔG983)-Orf46.1-His | Fwd1 | GGAGGCACTGGATCCTCAGATTTGGCAAACGATTC | 
 | Fwd2 | GCGGCGTCGACGGTGGCGGAGGCACTGGATCCTCAGA | SalI
 | Rev | CCCGCTCGAG-CGTATCATATTTCACGTGC | XhoI



* This primer was used as a Reverse primer for all the C terminal fusions of 287 to the His-tag. 

§ Forward primers used in combination with the 287-His Reverse primer.
NB - All PCR reactions use strain 2996 unless otherwise specified (e.g. strain MC58)



In all constructs starting with an ATG not followed by a unique NheI site, the ATG codon is
part of the NdeI site used for cloning. The constructs made using NheI as a cloning site at the
5' end (e.g. all those containing 287 at the N-terminus) have two additional codons (GCT
AGC) fused to the coding sequence of the antigen.
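
The cloning sites named in the table can be located in a primer programmatically. The following is a minimal sketch only, not part of the original protocol: the function names are hypothetical, and the recognition sequences are the standard ones for the enzymes listed in the table. It strips the hyphen that separates tail from gene-specific sequence and reports each site found, and flags primers whose NheI site contributes the two extra codons (GCT AGC).

```python
# Standard recognition sequences of the enzymes named in the primer table.
SITES = {
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "SalI": "GTCGAC",
    "HindIII": "AAGCTT",
    "NcoI": "CCATGG",
}


def find_sites(primer: str) -> list[tuple[int, str]]:
    """Return (0-based position, enzyme) for each recognition site, 5' to 3'."""
    seq = primer.replace("-", "").replace(" ", "").upper()
    hits = []
    for enzyme, site in SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((start, enzyme))
            start = seq.find(site, start + 1)
    return sorted(hits)


def nhei_extra_codons(primer: str) -> str:
    """Return 'GCT AGC' if the primer carries an NheI site, else ''."""
    seq = primer.replace("-", "").replace(" ", "").upper()
    return "GCT AGC" if SITES["NheI"] in seq else ""


if __name__ == "__main__":
    # 557L forward primer and 719-(His/GST) forward primer from the table.
    print(find_sites("CGCGGATCCCATATG-AACAAACTGTTTCTTAC"))
    print(nhei_extra_codons("CGCGGATCCGCTAGC-AAACTGTCGTTGGTGTTAAC"))
```

For the 557L forward primer this reports a BamHI site followed by the NdeI site whose ATG serves as the start codon.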

Preparation of chromosomal DNA templates 

N. meningitidis strains 2996, MC58, 394.98, 1000 and BZ232 (and others) were grown to
exponential phase in 100ml of GC medium, harvested by centrifugation, and resuspended in
5ml buffer (20% w/v sucrose, 50mM Tris-HCl, 50mM EDTA, pH8). After 10 minutes
incubation on ice, the bacteria were lysed by adding 10ml of lysis solution (50mM NaCl, 1%
Na-Sarkosyl, 50µg/ml Proteinase K), and the suspension incubated at 37°C for 2 hours. Two
phenol extractions (equilibrated to pH 8) and one CHCl3/isoamyl alcohol (24:1) extraction
were performed. DNA was precipitated by addition of 0.3M sodium acetate and 2 volumes
of ethanol, and collected by centrifugation. The pellet was washed once with 70% (v/v)
ethanol and redissolved in 4.0ml TE buffer (10mM Tris-HCl, 1mM EDTA, pH 8.0). The
DNA concentration was measured by reading OD260.

PCR Amplification

The standard PCR protocol was as follows: 200ng of genomic DNA from 2996, MC58, 1000,
or BZ232 strains or 10ng of plasmid DNA preparation of recombinant clones were used as
template in the presence of 40µM of each oligonucleotide primer, 400-800µM dNTPs
solution, 1x PCR buffer (including 1.5mM MgCl2), 2.5 units Taq DNA polymerase (using
Perkin-Elmer AmpliTaq or Boehringer Mannheim Expand™ Long Template).

After a preliminary 3 minute incubation of the whole mix at 95°C, each sample underwent a
two-step amplification: the first 5 cycles were performed using the hybridisation temperature
that excluded the restriction enzyme tail of the primer (Tm1). This was followed by 30 cycles
according to the hybridisation temperature calculated for the whole length oligos (Tm2).
Elongation times, performed at 68°C or 72°C, varied according to the length of the Orf to be
amplified. In the case of Orf1 the elongation time, starting from 3 minutes, was increased by
15 seconds each cycle. The cycles were completed with a 10 minute extension step at 72°C.
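
The two hybridisation temperatures and the Orf1 elongation schedule can be sketched as follows. This is an illustration under stated assumptions, not the formulae of this section, which does not give them: Tm1 is estimated with the Wallace rule on the gene-specific part only, Tm2 with a common length-corrected GC formula on the whole oligo, and the 15 second increment is assumed to run across all 35 cycles. All function names are hypothetical.

```python
def split_primer(primer: str) -> tuple[str, str]:
    """Split 'TAIL-GENE' at the hyphen used in the primer table.

    Primers written without a hyphen are returned with an empty tail.
    """
    if "-" not in primer:
        return "", primer.strip()
    tail, _, gene = primer.partition("-")
    return tail.strip(), gene.strip()


def wallace_tm(seq: str) -> int:
    """Wallace rule (assumed for Tm1): 2 °C per A/T plus 4 °C per G/C."""
    s = seq.upper()
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))


def long_oligo_tm(seq: str) -> float:
    """Length-corrected GC formula (assumed for Tm2): 64.9 + 41*(GC - 16.4)/N."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(s)


def elongation_times(start_s: int = 180, step_s: int = 15, cycles: int = 35) -> list[int]:
    """Elongation time in seconds per cycle: 3 min, then +15 s each cycle."""
    return [start_s + step_s * i for i in range(cycles)]


if __name__ == "__main__":
    tail, gene = split_primer("CGCGGATCCCATATG-AACAAACTGTTTCTTAC")  # 557L Fwd
    print("Tm1 (tail excluded):", wallace_tm(gene))
    print("Tm2 (whole oligo):", round(long_oligo_tm(tail + gene), 1))
    print("first elongation times (s):", elongation_times()[:3])
```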

The amplified DNA was loaded directly on a 1% agarose gel. The DNA fragment
corresponding to the band of correct size was purified from the gel using the Qiagen Gel
Extraction Kit, following the manufacturer's protocol.

Digestion of PCR fragments and of the cloning vectors 

The purified DNA corresponding to the amplified fragment was digested with the
appropriate restriction enzymes for cloning into pET-21b+, pET-22b+ or pET-24b+. Digested
fragments were purified using the QIAquick PCR purification kit (following the
manufacturer's instructions) and eluted with either H2O or 10mM Tris, pH 8.5. Plasmid
vectors were digested with the appropriate restriction enzymes, loaded onto a 1.0% agarose
gel and the band corresponding to the digested vector purified using the Qiagen QIAquick
Gel Extraction Kit.

Cloning

The fragments corresponding to each gene, previously digested and purified, were ligated 
into pET-21b+, pET-22b+ or pET-24b+. A molar ratio of 3:1 fragment/vector was used with
T4 DNA ligase in the ligation buffer supplied by the manufacturer. 
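
The 3:1 molar ratio translates into a mass of insert that depends on the relative lengths of fragment and vector. A minimal sketch of that arithmetic follows; the function name and the example sizes (a 5400 bp vector and a 1200 bp insert) are illustrative assumptions, not values taken from this document.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float = 3.0) -> float:
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert_ng = vector_ng * (insert_bp / vector_bp) * molar_ratio.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    # Hypothetical example: 100 ng of a 5400 bp vector, 1200 bp insert, 3:1.
    print(round(insert_mass_ng(100, 5400, 1200), 1), "ng insert")
```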

Recombinant plasmid was transformed into competent E.coli DH5 or HB101 by incubating
the ligase reaction solution and bacteria for 40 minutes on ice, then at 37°C for 3 minutes.
This was followed by the addition of 800µl LB broth and incubation at 37°C for 20 minutes.
The cells were centrifuged at maximum speed in an Eppendorf microfuge, resuspended in
approximately 200µl of the supernatant and plated onto LB ampicillin (100mg/ml) agar.

Screening for recombinant clones was performed by growing randomly selected colonies
overnight at 37°C in 4.0ml of LB broth + 100µg/ml ampicillin. Cells were pelleted and
plasmid DNA extracted using the Qiagen QIAprep Spin Miniprep Kit, following the
manufacturer's instructions. Approximately 1µg of each individual miniprep was digested
with the appropriate restriction enzymes and the digest loaded onto a 1-1.5% agarose gel
(depending on the expected insert size), in parallel with the molecular weight marker (1kb
DNA Ladder, GIBCO). Positive clones were selected on the basis of the size of insert.

Expression 

After cloning each gene into the expression vector, recombinant plasmids were transformed
into E.coli strains suitable for expression of the recombinant protein. 1µl of each construct
was used to transform E.coli BL21-DE3 as described above. Single recombinant colonies
were inoculated into 2ml LB+Amp (100µg/ml), incubated at 37°C overnight, then diluted
1:30 in 20ml of LB+Amp (100µg/ml) in 100ml flasks, to give an OD600 between 0.1 and 0.2.
The flasks were incubated at 30°C or at 37°C in a gyratory water bath shaker until OD600
indicated exponential growth suitable for induction of expression (0.4-0.8 OD). Protein
expression was induced by addition of 1.0mM IPTG. After 3 hours incubation at 30°C or
37°C the OD600 was measured and expression examined. 1.0ml of each sample was
centrifuged in a microfuge, the pellet resuspended in PBS and analysed by SDS-PAGE and
Coomassie Blue staining.
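
The volumes implied by the induction step follow from simple dilution arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 1 M IPTG stock is an assumed concentration, and "1:30" is read here as one part culture in thirty parts total.

```python
def inoculum_ml(total_ml: float, dilution: float = 30.0) -> float:
    """Volume of overnight culture for a 1:dilution dilution into total_ml."""
    return total_ml / dilution


def stock_volume_ul(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Volume of stock (µl) giving final_mM in culture_ml (C1*V1 = C2*V2).

    The small volume added by the stock itself is neglected.
    """
    return final_mM * culture_ml * 1000.0 / stock_mM


def ready_to_induce(od600: float, low: float = 0.4, high: float = 0.8) -> bool:
    """True when OD600 lies in the induction window given in the text."""
    return low <= od600 <= high


if __name__ == "__main__":
    print(round(inoculum_ml(20), 2), "ml overnight culture into 20 ml")
    print(stock_volume_ul(1.0, 20, 1000), "µl of 1 M IPTG (assumed stock)")
    print(ready_to_induce(0.6))
```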

Gateway cloning and expression 

Sequences labelled GATE were cloned and expressed using the GATEWAY Cloning
Technology (GIBCO-BRL). Recombinational cloning (RC) is based on the recombination
reactions that mediate the integration and excision of phage into and from the E.coli genome,
respectively. The integration involves recombination of the attP site of the phage DNA within
the attB site located in the bacterial genome (BP reaction) and generates an integrated phage
genome flanked by attL and attR sites. The excision recombines attL and attR sites back to attP
and attB sites (LR reaction). The integration reaction requires two enzymes [the phage protein
Integrase (Int) and the bacterial protein integration host factor (IHF)] (BP clonase). The
excision reaction requires Int, IHF, and an additional phage enzyme, Excisionase (Xis) (LR
clonase). Artificial derivatives of the 25-bp bacterial attB recombination site, referred to as B1
and B2, were added to the 5' end of the primers used in PCR reactions to amplify Neisserial
ORFs. The resulting products were BP cloned into a "Donor vector" containing complementary
derivatives of the phage attP recombination site (P1 and P2) using BP clonase. The resulting
"Entry clones" contain ORFs flanked by derivatives of the attL site (L1 and L2) and were
subcloned into expression "destination vectors" which contain derivatives of the attL-compatible
attR sites (R1 and R2) using LR clonase. This resulted in "expression clones" in
which ORFs are flanked by B1 and B2 and fused in frame to the GST or His N terminal tags.

The E. coli strain used for GATEWAY expression is BL21-SI. Cells of this strain are induced
for expression of the T7 RNA polymerase by growth in medium containing salt (0.3 M NaCl).

Note that this system gives N-terminus His tags.

Preparation of membrane proteins.

Fractions composed principally of either inner, outer or total membrane were isolated in
order to obtain recombinant proteins expressed with membrane-localisation leader
sequences. The method for preparation of membrane fractions, enriched for recombinant
proteins, was adapted from Filip et al. [J.Bact (1973) 115:717-722] and Davies et al.
[J.Immunol.Meth (1990) 143:215-225]. Single colonies harbouring the plasmid of interest
were grown overnight at 37°C in 20 ml of LB/Amp (100 µg/ml) liquid culture. Bacteria were
diluted 1:30 in 1.0 L of fresh medium and grown at either 30°C or 37°C until the OD550
reached 0.6-0.8. Expression of recombinant protein was induced with IPTG at a final
concentration of 1.0 mM. After incubation for 3 hours, bacteria were harvested by
centrifugation at 8000g for 15 minutes at 4°C and resuspended in 20 ml of 20 mM Tris-HCl
(pH 7.5) and complete protease inhibitors (Boehringer-Mannheim). All subsequent
procedures were performed at 4°C or on ice.

Cells were disrupted by sonication using a Branson Sonifier 450 and centrifuged at 5000g
for 20 min to sediment unbroken cells and inclusion bodies. The supernatant, containing
membranes and cellular debris, was centrifuged at 50000g (Beckman Ti50, 29000rpm) for
75 min, washed with 20 mM Bis-tris propane (pH 6.5), 1.0 M NaCl, 10% (v/v) glycerol and
sedimented again at 50000g for 75 minutes. The pellet was resuspended in 20mM Tris-HCl
(pH 7.5), 2.0% (v/v) Sarkosyl, complete protease inhibitor (1.0 mM EDTA, final
concentration) and incubated for 20 minutes to dissolve inner membrane. Cellular debris was
pelleted by centrifugation at 5000g for 10 min and the supernatant centrifuged at 75000g for
75 minutes (Beckman Ti50, 33000rpm). Proteins 008L and 519L were found in the
supernatant suggesting inner membrane localisation. For these proteins both inner and total
membrane fractions (washed with NaCl as above) were used to immunise mice. Outer
membrane vesicles obtained from the 75000g pellet were washed with 20 mM Tris-HCl (pH
7.5) and centrifuged at 75000g for 75 minutes or overnight. The OMV was finally
resuspended in 500 µl of 20 mM Tris-HCl (pH 7.5), 10% v/v glycerol. Orf1L and Orf40L
were both localised and enriched in the outer membrane fraction which was used to
immunise mice. Protein concentration was estimated by standard Bradford Assay (Bio-Rad),
while protein concentration of inner membrane fraction was determined with the DC protein
assay (Bio-Rad). Various fractions from the isolation procedure were assayed by SDS-PAGE.
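
The g-forces quoted above are paired with rotor speeds, and the two are related by the standard formula RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm). The sketch below illustrates the conversion; the radius used in the example is a hypothetical value chosen for illustration and is not a rotor specification from this document.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for(rcf_g: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach rcf_g at the given radius."""
    return math.sqrt(rcf_g / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius = 5.32  # cm, hypothetical effective radius for illustration only
    print(round(rcf(29000, radius)), "x g at 29000 rpm")
    print(round(rpm_for(75000, radius)), "rpm for 75000 x g at the same radius")
```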

Purification of His-tagged proteins

Various forms of 287 were cloned from strains 2996 and MC58. They were constructed with
a C-terminus His-tagged fusion and included a mature form (aa 18-427), constructs with
deletions (Δ1, Δ2, Δ3 and Δ4) and clones composed of either B or C domains. For each
clone purified as a His-fusion, a single colony was streaked and grown overnight at 37°C on
a LB/Amp (100 µg/ml) agar plate. An isolated colony from this plate was inoculated into
20ml of LB/Amp (100 µg/ml) liquid medium and grown overnight at 37°C with shaking.
The overnight culture was diluted 1:30 into 1.0 L LB/Amp (100 µg/ml) liquid medium and
allowed to grow at the optimal temperature (30 or 37°C) until the OD550 reached 0.6-0.8.
Expression of recombinant protein was induced by addition of IPTG (final concentration
1.0mM) and the culture incubated for a further 3 hours. Bacteria were harvested by
centrifugation at 8000g for 15 min at 4°C. The bacterial pellet was resuspended in 7.5 ml of
either (i) cold buffer A (300 mM NaCl, 50 mM phosphate buffer, 10 mM imidazole, pH 8.0)
for soluble proteins or (ii) buffer B (10mM Tris-HCl, 100 mM phosphate buffer, pH 8.8 and,
optionally, 8M urea) for insoluble proteins. Proteins purified in a soluble form included
287-His, Δ1, Δ2, Δ3 and Δ4287-His, Δ4287MC58-His, 287c-His and 287cMC58-His.
Protein 287bMC58-His was insoluble and purified accordingly. Cells were disrupted by
sonication on ice four times for 30 sec at 40W using a Branson sonifier 450 and centrifuged
at 13000g for 30 min at 4°C. For insoluble proteins, pellets were resuspended in 2.0 ml
buffer C (6 M guanidine hydrochloride, 100 mM phosphate buffer, 10 mM Tris-HCl, pH 7.5)
and treated with 10 passes of a Dounce homogenizer. The homogenate was centrifuged at
13000g for 30 min and the supernatant retained. Supernatants for both soluble and insoluble
preparations were mixed with 150µl Ni2+-resin (previously equilibrated with either buffer A
or buffer B, as appropriate) and incubated at room temperature with gentle agitation for 30
min. The resin was Chelating Sepharose Fast Flow (Pharmacia), prepared according to the
manufacturer's protocol. The batch-wise preparation was centrifuged at 700g for 5 min at
4°C and the supernatant discarded. The resin was washed twice (batch-wise) with 10ml
buffer A or B for 10 min, resuspended in 1.0 ml buffer A or B and loaded onto a disposable
column. Washing was continued with either (i) buffer A at 4°C or (ii) buffer B at
room temperature, until the OD280 of the flow-through reached 0.02-0.01. The resin was
further washed with either (i) cold buffer C (300mM NaCl, 50mM phosphate buffer, 20mM
imidazole, pH 8.0) or (ii) buffer D (10mM Tris-HCl, 100mM phosphate buffer, pH 6.3 and,
optionally, 8M urea) until OD280 of the flow-through reached 0.02-0.01. The His-fusion
protein was eluted by addition of 700µl of either (i) cold elution buffer A (300 mM NaCl,
50mM phosphate buffer, 250 mM imidazole, pH 8.0) or (ii) elution buffer B (10 mM
Tris-HCl, 100 mM phosphate buffer, pH 4.5 and, optionally, 8M urea) and fractions
collected until the OD280 indicated all the recombinant protein was obtained. 20µl aliquots of
each elution fraction were analysed by SDS-PAGE. Protein concentrations were estimated
using the Bradford assay.

Renaturation of denatured His-fusion proteins.

Denaturation was required to solubilize 287bMC58, so a renaturation step was employed prior
to immunisation. Glycerol was added to the denatured fractions obtained above to give a
final concentration of 10% v/v. The proteins were diluted to 200 µg/ml using dialysis buffer
I (10% v/v glycerol, 0.5M arginine, 50 mM phosphate buffer, 5.0 mM reduced glutathione,
0.5 mM oxidised glutathione, 2.0M urea, pH 8.8) and dialysed against the same buffer for
12-14 hours at 4°C. Further dialysis was performed with buffer II (10% v/v glycerol, 0.5M
arginine, 50mM phosphate buffer, 5.0mM reduced glutathione, 0.5mM oxidised glutathione,
pH 8.8) for 12-14 hours at 4°C. Protein concentration was estimated using the formula:
Protein (mg/ml) = (1.55 x OD280) - (0.76 x OD260)
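
The formula above can be applied directly to a pair of absorbance readings. A minimal sketch follows; the function name and the example readings are illustrative assumptions.

```python
def protein_mg_per_ml(od280: float, od260: float) -> float:
    """Protein (mg/ml) = (1.55 x OD280) - (0.76 x OD260), as given in the text."""
    return 1.55 * od280 - 0.76 * od260


if __name__ == "__main__":
    # Hypothetical readings for illustration.
    print(round(protein_mg_per_ml(1.0, 0.5), 2), "mg/ml")
```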



WO 01/64922    PCT/IB01/00452

Amino acid sequence analysis. 

Automated sequence analysis of the NH2-terminus of proteins was performed on a Beckman
sequencer (LF 3000) equipped with an on-line phenylthiohydantoin-amino acid analyser 
(System Gold) according to the manufacturer's recommendations. 

Immunization

Balb/C mice were immunized with antigens on days 0, 21 and 35 and sera analyzed at day 49.

Sera analysis - ELISA

The acapsulated MenB M7 and the capsulated strains were plated on chocolate agar plates 
and incubated overnight at 37°C with 5% CO2. Bacterial colonies were collected from the 
agar plates using a sterile dacron swab and inoculated into Mueller-Hinton Broth (Difco) 
containing 0.25% glucose. Bacterial growth was monitored every 30 minutes by following 
OD620. The bacteria were allowed to grow until the OD reached 0.4-0.5. The culture 
was centrifuged for 10 minutes at 4000rpm. The supernatant was discarded and bacteria 
were washed twice with PBS, resuspended in PBS containing 0.025% formaldehyde, and 
incubated for 1 hour at 37°C and then overnight at 4°C with stirring. 100 µl of bacterial cells 
were added to each well of a 96 well Greiner plate and incubated overnight at 4°C. The wells 
were then washed three times with PBT washing buffer (0.1% Tween-20 in PBS). 200 µl of 
saturation buffer (2.7% polyvinylpyrrolidone 10 in water) was added to each well and the 
plates incubated for 2 hours at 37°C. Wells were washed three times with PBT. 200 µl of 
diluted sera (Dilution buffer: 1% BSA, 0.1% Tween-20, 0.1% NaN3 in PBS) were added to 
each well and the plates incubated for 2 hours at 37°C. Wells were washed three times with 
PBT. 100 µl of HRP-conjugated rabbit anti-mouse (Dako) serum diluted 1:2000 in dilution 
buffer were added to each well and the plates were incubated for 90 minutes at 37°C. Wells 
were washed three times with PBT buffer. 100 µl of substrate buffer for HRP (25 ml of citrate 
buffer pH5, 10 mg of o-phenylenediamine and 10 µl of H2O2) were added to each well and the 
plates were left at room temperature for 20 minutes. 100 µl of 12.5% H2SO4 was added to each 
well and OD490 was measured. The ELISA titers were calculated arbitrarily as the dilution of 
sera which gave an OD490 value of 0.4 above the level of preimmune sera. The ELISA was 
considered positive when the dilution of sera with OD490 of 0.4 was higher than 1:400. 
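
The titer definition above can be illustrated with a short calculation. This is a sketch under stated assumptions: the text does not say how the dilution giving an OD490 of 0.4 above preimmune was read off between measured dilutions, so the log-linear interpolation used here, the single preimmune OD value, the function names and the example readings are all hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def elisa_titer(readings: Sequence[Tuple[float, float]],
                preimmune_od: float,
                cutoff: float = 0.4) -> Optional[float]:
    """Return the reciprocal serum dilution at which OD490 is `cutoff`
    above the preimmune level, or None if the curve never crosses it.

    readings: (reciprocal dilution, OD490) pairs for the immune serum.
    Interpolation on log10(dilution) is an assumption of this sketch.
    """
    net = [(d, od - preimmune_od) for d, od in sorted(readings)]
    for (d1, y1), (d2, y2) in zip(net, net[1:]):
        if y1 >= cutoff > y2:
            frac = (y1 - cutoff) / (y1 - y2)
            log_d = math.log10(d1) + frac * (math.log10(d2) - math.log10(d1))
            return 10 ** log_d
    return None


def is_positive(titer: Optional[float], threshold: float = 400.0) -> bool:
    """Positive when the titer dilution is higher than 1:400."""
    return titer is not None and titer > threshold


# Hypothetical readings, for illustration only.
readings = [(100, 1.60), (200, 1.20), (400, 0.85), (800, 0.55), (1600, 0.30)]
titer = elisa_titer(readings, preimmune_od=0.05)
print(titer, is_positive(titer))
```

A return value of None means the cutoff was not bracketed by the measured dilutions, in which case the dilution series would need to be extended rather than a titer extrapolated.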

Sera analysis - FACS Scan bacteria binding assay 

The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated 
overnight at 37°C with 5% CO2. Bacterial colonies were collected from the agar plates using 
a sterile dacron swab and inoculated into 4 tubes, each containing 8 ml of Mueller-Hinton Broth 
(Difco) containing 0.25% glucose. Bacterial growth was monitored every 30 minutes by 
following OD620. The bacteria were allowed to grow until the OD reached 0.35-0.5. 
The culture was centrifuged for 10 minutes at 4000rpm. The supernatant was discarded and 
the pellet was resuspended in blocking buffer (1% BSA in PBS, 0.4% NaN3) and centrifuged 
for 5 minutes at 4000rpm. Cells were resuspended in blocking buffer to reach OD620 of 0.05. 
100 µl of bacterial cells were added to each well of a Costar 96 well plate. 100 µl of diluted 
(1:100, 1:200, 1:400) sera (in blocking buffer) were added to each well and plates incubated 
for 2 hours at 4°C. Cells were centrifuged for 5 minutes at 4000rpm, the supernatant 
aspirated and cells washed by addition of 200 µl/well of blocking buffer. 100 µl 
of R-phycoerythrin-conjugated F(ab)2 goat anti-mouse, diluted 1:100, was added to each well 
and plates incubated for 1 hour at 4°C. Cells were spun down by centrifugation at 4000rpm 
for 5 minutes and washed by addition of 200 µl/well of blocking buffer. The supernatant was 
aspirated and cells resuspended in 200 µl/well of PBS, 0.25% formaldehyde. Samples were 
transferred to FACScan tubes and read. The FACScan settings (laser power 15 mW) 
were: FL2 on; FSC-H threshold: 92; FSC PMT Voltage: E 01; SSC PMT: 474; Amp. 
Gains 6.1; FL-2 PMT: 586; compensation values: 0. 

Sera analysis - bactericidal assay 

N. meningitidis strain 2996 was grown overnight at 37°C on chocolate agar plates (starting 
from a frozen stock) with 5% CO2. Colonies were collected and used to inoculate 7 ml 
Mueller-Hinton broth, containing 0.25% glucose, to reach an OD620 of 0.05-0.08. The culture 
was incubated for approximately 1.5 hours at 37°C with shaking until the OD620 
reached 0.23-0.24. Bacteria were diluted in 50 mM phosphate buffer pH 7.2 
containing 10 mM MgCl2, 10 mM CaCl2 and 0.5% (w/v) BSA (assay buffer) to the working 
dilution of 10^5 CFU/ml. The total volume of the final reaction mixture was 50 µl, with 25 µl 
of serial two-fold dilutions of test serum, 12.5 µl of bacteria at the working dilution and 12.5 µl of 
baby rabbit complement (final concentration 25%). 
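
The composition of the reaction mixture can be checked with a few lines of arithmetic. This is an illustrative sketch: the volumes and the working dilution come from the text, while the function names and the starting serum dilution of the two-fold series are hypothetical.

```python
# Volumes (µl) of the reaction mixture, as stated in the text.
SERUM_UL = 25.0
BACTERIA_UL = 12.5
COMPLEMENT_UL = 12.5
TOTAL_UL = SERUM_UL + BACTERIA_UL + COMPLEMENT_UL

WORKING_CFU_PER_ML = 1e5  # working dilution of the bacteria


def percent_of_mixture(volume_ul: float) -> float:
    """Final concentration (% v/v) of a component in the mixture."""
    return 100.0 * volume_ul / TOTAL_UL


def cfu_added_per_well() -> float:
    """CFU delivered to each well by the bacterial inoculum."""
    return WORKING_CFU_PER_ML * BACTERIA_UL / 1000.0


def two_fold_series(first_reciprocal: int, steps: int) -> list:
    """Reciprocal serum dilutions of a serial two-fold series."""
    return [first_reciprocal * 2 ** i for i in range(steps)]


print(TOTAL_UL)                           # total reaction volume in µl
print(percent_of_mixture(COMPLEMENT_UL))  # complement, % of mixture
print(cfu_added_per_well())               # CFU per well at time 0
print(two_fold_series(4, 5))              # hypothetical starting dilution 1:4
```

The 12.5 µl of complement in a 50 µl mixture reproduces the stated final concentration of 25%, and the same calculation shows that the test serum is diluted a further two-fold on addition to the well.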

Controls included bacteria incubated with complement serum, immune sera incubated with 
bacteria and with complement inactivated by heating at 56°C for 30 minutes. Immediately after the 
addition of the baby rabbit complement, 10 µl of the controls were plated on Mueller-Hinton 
agar plates using the tilt method (time 0). The 96-well plate was incubated for 1 hour at 
37°C with rotation. 7 µl of each sample were plated on Mueller-Hinton agar plates as spots, 
whereas 10 µl of the controls were plated on Mueller-Hinton agar plates using the tilt method 
(time 1). Agar plates were incubated for 18 hours at 37°C and the colonies 
corresponding to time 0 and time 1 were counted. 

Sera analysis - western blots 

Purified proteins (500 ng/lane), outer membrane vesicles (5 µg) and total cell extracts (25 µg) 
derived from MenB strain 2996 were loaded onto a 12% SDS-polyacrylamide gel and 
transferred to a nitrocellulose membrane. The transfer was performed for 2 hours at 150mA 
at 4°C, using transfer buffer (0.3% Tris base, 1.44% glycine, 20% (v/v) methanol). The 
membrane was saturated by overnight incubation at 4°C in saturation buffer (10% skimmed 
milk, 0.1% Triton X100 in PBS). The membrane was washed twice with washing buffer (3% 
skimmed milk, 0.1% Triton X100 in PBS) and incubated for 2 hours at 37°C with mouse sera 
diluted 1:200 in washing buffer. The membrane was washed twice and incubated for 90 
minutes with a 1:2000 dilution of horseradish peroxidase labelled anti-mouse Ig. The 
membrane was washed twice with 0.1% Triton X100 in PBS and developed with the 
Opti-4CN Substrate Kit (Bio-Rad). The reaction was stopped by adding water. 

The OMVs were prepared as follows: N. meningitidis strain 2996 was grown overnight at 37°C 
with 5% CO2 on 5 GC plates, harvested with a loop and resuspended in 10 ml of 
20mM Tris-HCl pH 7.5, 2 mM EDTA. Heat inactivation was performed at 56°C for 45 
minutes and the bacteria disrupted by sonication for 5 minutes on ice (50% duty cycle, 50% 
output, Branson sonifier 3 mm microtip). Unbroken cells were removed by centrifugation at 
5000g for 10 minutes, and the supernatant containing the total cell envelope fraction was recovered 
and further centrifuged overnight at 50000g at 4°C. The pellet containing 
the membranes was resuspended in 2% sarkosyl, 20mM Tris-HCl pH 7.5, 2 mM EDTA and 
incubated at room temperature for 20 minutes to solubilise the inner membranes. The 
suspension was centrifuged at 10000g for 10 minutes to remove aggregates, and the supernatant 
was further centrifuged at 50000g for 3 hours. The pellet, containing the outer membranes, 
was washed in PBS and resuspended in the same buffer. Protein concentration was measured 
by the D.C. Bio-Rad Protein assay (Modified Lowry method), using BSA as a standard. 

Total cell extracts were prepared as follows: N. meningitidis strain 2996 was grown 
overnight on a GC plate, harvested with a loop and resuspended in 1ml of 20mM Tris-HCl. 
Heat inactivation was performed at 56°C for 30 minutes. 

961 domain studies 

Cellular fractions preparation: Total lysate, periplasm, supernatant and OMV of E.coli clones 
expressing different domains of 961 were prepared using bacteria from over-night cultures or 
after 3 hours induction with IPTG. Briefly, the periplasm was obtained by suspending bacteria 
in 25% sucrose and 50mM Tris (pH 8) with 100 µg/ml polymyxin. After 1 hr at room 
temperature bacteria were centrifuged at 13000rpm for 15 min and the supernatant was 
collected. The culture supernatant was filtered through a 0.2 µm filter and precipitated with 50% TCA 
on ice for two hours. After centrifugation (30 min at 13000 rpm) pellets were rinsed twice with 
70% ethanol and suspended in PBS. The OMV preparation was performed as previously 
described. Each cellular fraction was analyzed by SDS-PAGE or by Western blot using the 
polyclonal anti-serum raised against GST-961. 

Adhesion assay: Chang epithelial cells (Wong-Kilbourne derivative, clone 1-5c-4, human 
conjunctiva) were maintained in DMEM (Gibco) supplemented with 10% heat-inactivated 
FCS, 15mM L-glutamine and antibiotics. 

For the adherence assay, sub-confluent cultures of Chang epithelial cells were rinsed with 
PBS and treated with trypsin-EDTA (Gibco) to release them from the plastic support. The 
cells were then suspended in PBS, counted and diluted in PBS to 5x10^5 cells/ml. 

Bacteria from over-night cultures or after induction with IPTG were pelleted and washed 
twice with PBS by centrifuging at 13000 for 5 min. Approximately 2-3x10^8 cfu were 
incubated with 0.5 mg/ml FITC (Sigma) in 1ml buffer containing 50mM NaHCO3 and 
100mM NaCl pH 8, for 30 min at room temperature in the dark. FITC-labeled bacteria were 
washed 2-3 times and suspended in PBS at 1-1.5x10^9/ml. 200 µl of this suspension (2-3x10^8) 
were incubated with 200 µl (1x10^5) epithelial cells for 30 min at 37°C. Cells were then 
centrifuged at 2000rpm for 5 min to remove non-adherent bacteria, suspended in 200 µl of 
PBS, transferred to FACScan tubes and read. 
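
The cell and bacterial numbers quoted for the adherence assay follow from the stated concentrations and volumes, as the short check below shows. This is an illustrative sketch only; the function name is hypothetical, and the bacteria-to-cell ratio is derived here from the stated figures rather than quoted in the text.

```python
def count_in_volume(conc_per_ml: float, volume_ul: float) -> float:
    """Number of cells or bacteria contained in a given volume."""
    return conc_per_ml * volume_ul / 1000.0


# Bacteria: 200 µl of a suspension at 1-1.5 x 10^9 per ml.
bacteria_low = count_in_volume(1.0e9, 200)
bacteria_high = count_in_volume(1.5e9, 200)

# Epithelial cells: 200 µl of a suspension at 5 x 10^5 cells per ml.
cells = count_in_volume(5e5, 200)

# Derived bacteria-to-cell ratio (not stated in the text).
ratio_low = bacteria_low / cells
ratio_high = bacteria_high / cells

print(f"bacteria: {bacteria_low:.1e} to {bacteria_high:.1e}")
print(f"epithelial cells: {cells:.1e}")
print(f"bacteria per cell: {ratio_low:.0f} to {ratio_high:.0f}")
```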

CLAIMS 

1. A method for the heterologous expression of a protein of the invention, in which (a) at 
least one domain in the protein is deleted and, optionally, (b) no fusion partner is used. 

2. The method of claim 1, in which the protein of the invention is ORF46. 

3. The method of claim 2, in which ORF46 is divided into a first domain (amino acids 
1-433) and a second domain (amino acids 433-608). 

4. The method of claim 2, in which the protein of the invention is 564. 

5. The method of claim 4, in which protein 564 is divided into domains as shown in Figure 
8. 

6. The method of claim 1, in which the protein of the invention is 961. 

7. The method of claim 6, in which protein 961 is divided into domains as shown in Figure 
12. 

8. The method of claim 1, in which the protein of the invention is 502 and the domain is 
amino acids 28 to 167 (numbered according to the MC58 sequence). 

9. The method of claim 1, in which the protein of the invention is 287. 

10. A method for the heterologous expression of a protein of the invention, in which (a) a 
portion of the N-terminal domain of the protein is deleted. 

11. The method of claim 9 or claim 10, in which protein 287 is divided into domains A, B & 
C shown in Figure 5. 

12. The method of claim 11, in which (i) domain A, (ii) domains A and B, or (iii) domains A 
and C are deleted. 

13. The method of claim 11, wherein (i) amino acids 1-17, (ii) amino acids 1-25, (iii) amino 
acids 1-69, or (iv) amino acids 1-106, of domain A are deleted. 

14. A method for the heterologous expression of a protein of the invention, in which (a) no 
fusion partner is used, and (b) the protein's native leader peptide (if present) is used. 

15. The method of claim 14, in which the protein of the invention is selected from the group 
consisting of: 111, 149, 206, 225-1, 235, 247-1, 274, 283, 286, 292, 401, 406, 502-1, 
503, 519-1, 525-1, 552, 556, 557, 570, 576-1, 580, 583, 664, 759, 907, 913, 920-1, 936-1, 
953, 961, 983, 989, Orf4, Orf7-1, Orf9-1, Orf23, Orf25, Orf37, Orf38, Orf40, Orf40.1, 
Orf40.2, Orf72-1, Orf76-1, Orf85-2, Orf91, Orf97-1, Orf119, Orf143.1, NMB0109, 
NMB2050, 008, 105, 117-1, 121-1, 122-1, 128-1, 148, 216, 243, 308, 593, 652, 726, 
926, 982, Orf83-1 and Orf143-1. 

16. A method for the heterologous expression of a protein of the invention, in which (a) the 
protein's leader peptide is replaced by the leader peptide from a different protein and, 
optionally, (b) no fusion partner is used. 

17. The method of claim 16, in which the different protein is 961, ORF4, E.coli OmpA, or 
E.carotovora PelB, or in which the leader peptide is MKKYLFSAA. 

18. The method of claim 17, in which the different protein is E.coli OmpA and the protein of 
the invention is ORF1. 

19. The method of claim 17, in which the protein of the invention is 911 and the different 
protein is E.carotovora PelB or E.coli OmpA. 

20. The method of claim 17, in which the different protein is ORF4 and the protein of the 
invention is 287. 

21. A method for the heterologous expression of a protein of the invention, in which (a) the 
protein's leader peptide is deleted and, optionally, (b) no fusion partner is used. 

22. The method of claim 21, in which the protein of the invention is 919. 

23. A method for the heterologous expression of a protein of the invention, in which 
expression of a protein of the invention is carried out at a temperature at which a toxic 
activity of the protein is not manifested. 

24. The method of claim 23, in which protein 919 is expressed at 30°C. 

25. A method for the heterologous expression of a protein of the invention, in which protein 
is mutated to reduce or eliminate toxic activity. 

26. The method of claim 25, in which the protein of the invention is 907, 919 or 922. 

27. The method of claim 26, in which 907 is mutated at Glu-117 (e.g. Glu→Gly). 

28. The method of claim 26, in which 919 is mutated at Glu-255 (e.g. Glu→Gly) and/or 
Glu-323 (e.g. Glu→Gly). 

29. The method of claim 26, in which 922 is mutated at Glu-164 (e.g. Glu→Gly), Ser-213 
(e.g. Ser→Gly) and/or Asn-348 (e.g. Asn→Gly). 

30. A method for the heterologous expression of a protein of the invention, in which vector 
pSM214 is used or vector pET-24b is used. 

31. The method of claim 30, in which the protein of the invention is 953 and the vector is 
pSM214. 

32. A method for the heterologous expression of a protein of the invention, in which a 
protein is expressed or purified such that it adopts a particular multimeric form. 

33. The method of claim 32, in which protein 953 is expressed and/or purified in monomeric 
form. 

34. The method of claim 32, in which protein 961 is expressed and/or purified in tetrameric 
form. 

35. The method of claim 32, in which protein 287 is expressed and/or purified in dimeric 
form. 

36. The method of claim 32, in which protein 919 is expressed and/or purified in monomeric 
form. 

37. A method for the heterologous expression of a protein of the invention, in which the 
protein is expressed as a lipidated protein. 

38. The method of claim 37, in which the protein of the invention is 919, 287, ORF4, 406, 
576, or ORF25. 

39. A method for the heterologous expression of a protein of the invention, in which (a) the 
protein's C-terminus region is mutated and, optionally, (b) no fusion partner is used. 

40. The method of claim 39, wherein the mutation is a substitution, an insertion, or a deletion. 

41. The method of claim 40, wherein the protein of the invention is 730, ORF29 or ORF46. 

42. A method for the heterologous expression of a protein of the invention, in which the 
protein's leader peptide is mutated. 

43. The method of claim 42, in which the protein of the invention is 919. 

44. A method for the heterologous expression of a protein, in which a poly-glycine stretch 
within the protein is mutated. 

45. The method of claim 44, wherein the protein is a protein of the invention. 

46. The method of claim 45, wherein the protein of the invention is 287, 741, 983 or Tbp2. 

47. The method of claim 46, wherein (Gly)6 is deleted from 287 or 983. 

48. The method of claim 46, wherein (Gly)4 is deleted from Tbp2 or 741. 

49. The method of claim 47 or claim 48, wherein the leader peptide is also deleted. 

50. The method of any preceding claim, in which the heterologous expression is in an E.coli 
host. 

51. A protein expressed by the method of any preceding claim. 

52. A heterologous protein comprising the N-terminal amino acid sequence MKKYLFSAA. 